Abstract

A separate compiler independently translates a program's components
in a way that preserves correctness of the program as a whole.
This dissertation develops techniques and tools for verified (mechanically
proved) separate compilation of programs in C.

Specifying and proving separate compilation for C is made challenging
by the coincidence of: compiler optimizations, such as register
spilling, that introduce compiler-managed (private) memory regions
into function stack frames, and C's stack-allocated addressable local
variables, which may leak portions of stack frames to other modules
when their addresses are passed as arguments to external function
calls. The CompCert compiler, as built/proved by Leroy et al. 2006-2015
and upon which this dissertation builds, has proofs of correctness
for whole programs, but its simulation relations are too weak to
specify or prove separately compiled modules.

The  main  contributions  of  the  dissertation  are: 

(i) language-independent linking, a new operational model of multilanguage
module interaction that supports the statement and proof
of cross-language contextual equivalence;

(ii) structured simulations, a program-equivalence proof method
that enables expressive module-local invariants on the state communicated
between compilation units at runtime;

(iii) the application of the above techniques to Compositional CompCert,
a verified separate compiler for C. As additional validation, the
dissertation demonstrates the connection of Compositional CompCert
to the Verifiable C program logic.
Chapter 1

Introduction


C is a language of contradictions. On the one hand, it was designed for [KR98],
and excels at, giving the programmer fine-grained control over byte-level
data representations as they are laid out in memory. Low-level control
simplifies the construction of software components like operating systems,
device drivers, and other resource-constrained systems for which a
garbage-collected language is unsuitable.

At the same time, the C specification [ISO11] fights to maintain a minimum
of abstraction, if only to preserve the sanity of C programmers and
compiler writers. This "C-level abstraction" does more than just circumscribe
control flow (to function call/return and local jumps); it imposes an
abstraction layer over data as well, including:

• the distinction of pointers from integers (in particular, casting pointers
to integers, and vice versa, is only implementation-defined [ISO11,
6.3.2.3]). This distinction runs counter to the intuitions of many C
programmers, who often assume general pointer-integer casts are
portable across implementations.

• the notion of memory object [ISO11, 3.15], as distinct from the underlying
bit- or byte-level representation of that object. For example, pointer
arithmetic and comparison in C across memory objects, the runtime
representations of language-level constructs such as (addressed)
variables, is undefined [ISO11, 6.5.6]. Contrast with assembly language,
in which no such distinction exists.

• (weak) typing, in the form of type tags (e.g., int or float*) ascribed
to memory objects. Compilers require types for register allocation and
stack-frame management, and take advantage in alias analyses of the
"strict aliasing" condition [ISO11, 6.5], which asserts that two pointers
of incompatible types never alias.

• object lifetime [ISO11, 6.2.4.6]. Certain memory objects have a lifetime
that is block-scoped. Block scoping gives compiler writers the freedom
to, e.g., reuse an object's storage once the object's lifetime has
ended. The lifetime of a malloc'd region extends to the point (in
the program execution) at which the region is deallocated, giving the
malloc implementation the freedom to reuse the freed region.
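
The pointer/integer distinction in the first bullet can be made concrete with a small sketch. It is not taken from the dissertation: the function name roundtrip_preserves_pointer is hypothetical, and the code assumes an implementation that provides the optional type uintptr_t. For such implementations, C guarantees only that a valid pointer converted to uintptr_t and back compares equal to the original; the integer value itself is implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper: converts a pointer to an integer and back.
 * The numeric value of `bits` is implementation-defined, so portable
 * code may rely on the round trip but not on the integer itself. */
int roundtrip_preserves_pointer(int *p) {
    uintptr_t bits = (uintptr_t)(void *)p; /* implementation-defined value */
    int *q = (int *)(void *)bits;          /* converts back to the original */
    return q == p;
}
```

A caller can check the round trip for any live object, but should not compare `bits` against a hard-coded constant.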

These abstractions (pointers/integers, objects, weak typing, lifetime, and
others) are more than just convenience. They fundamentally enable compilers
to do their work. In addition to strict aliasing, which facilitates alias
analysis, the correctness of compiler phases that reorganize memory layout,
such as stack-frame allocation, register allocation/spilling, function
inlining, and stack reuse optimization, depends deeply on whether the program
contexts in which the compiled code will run respect the C language
abstraction. I illustrate this point with multiple examples in the second half
of the introduction.

Well-defined C programs are of course valid program contexts (they are,
by definition, C-abstraction-preserving). But a compiler, and its correctness
proof if one exists, should generalize beyond pure C. Real software systems
like operating systems contain components written in multiple languages
(e.g., C and assembly), in order to perform tasks at varying levels of
abstraction. For example, an OS's scheduler might be written in C while its
process switcher and interrupt handlers are written in assembly language.
While some assembly-language modules will respect C-level abstractions
(at the points of interaction), others will not. The first question this thesis
answers is:

Under which program contexts (assembly or otherwise) is an optimizing
C compiler guaranteed to preserve program behavior?

Specific  related  technical  questions  include: 

How to give semantics to open modules (those that call functions defined
in other translation units)? Answering this question, and the
related "How does one reason about equivalence of open modules?", is
necessary to state (and prove) correctness of a separate compiler.

How to achieve language independence? Our compiler-correctness
theorems should apply regardless of the language in which program
contexts are implemented, assuming some basic semantic
conditions on context behavior.

Do the techniques scale to the complex features of languages such as
C? For example, can we handle addressed stack-allocated local
variables? Section 1.1 explains why such features do not mix
smoothly with proofs of compiler optimizations that transform
program memory layout.

Do the techniques scale to real systems? Compatibility with existing
verified compilers for C such as CompCert is an important
aspect of this work. Achieving compatibility means significant
proof engineering to validate the techniques against the actual
compiler transformations performed by CompCert.


The solution to the "which program contexts" question (Chapter 6) is
not a new type system or syntactic device, but a set of semantic restrictions
that circumscribe the behaviors of valid program contexts in a manner that
is independent of the particular language in which the contexts are
implemented. In the second half of this chapter, I further motivate these
restrictions by presenting a number of failure cases: what goes wrong when
compiled code is linked with contexts that break C-level abstractions, in
sometimes subtle ways.

In order to achieve language independence, it was necessary first to
define what it means for program modules in C and assembly (and perhaps
other languages) to interact. I solve this problem with interaction
semantics (Chapter 3), which defines the interface of sequential (and
well-synchronized concurrent) threads in a language-independent manner, and
language-independent linking (Chapter 4), which gives the overall semantics
of linked programs from the interaction semantics of the underlying
modules.

The final contribution of the thesis is the application of interaction
semantics, language-independent linking, and the semantic restrictions on
program contexts that I develop in Chapter 6 to Compositional CompCert,
a verified compositional C compiler [SBCA14]. Compositional CompCert
extends the correctness specification/proof of Leroy et al.'s CompCert verified
C compiler (which dealt only with whole programs) to separate compilation.
The major additional technical advance in the proof itself is structured
simulations (Chapter 5), an extension of Leroy's simulation proofs to
support both rely-guarantee reasoning (about the program properties that are
assumed and preserved by compilation) and fine-grained invariants on
program state that distinguish, e.g., compiler-managed spilled registers from
programmer-managed stack-allocated local variables.
1.1  Motivation

1.1.1  Verifying  Realistic  Optimizations 

The correctness of common compiler optimizations like constant propagation,
spilling,1 dead-code elimination, and function inlining is sensitive to
the program contexts in which the compiled code is executed. As an example,
consider the following C program fragment:


int g(int *);
static int f(void) {
  int a = 0;
  return g(&a);
}

int main(void) {
  return f();
}

[Figure: stack diagram with the activation records for main and f; address label 0xbffff99c.]

The code on the left starts from main, which calls internal function f, which
in turn calls external function g(&a) (declared in this module but defined
by another translation unit), passing the address of the stack-allocated local
variable a as argument. To the right of the code, I give a schematic
representation of the memory state at the point at which g is called: The
stack grows downward; the outlined blocks are the activation records for the
calls to main and f respectively.

So  far,  so  good.  But  consider  for  a  moment  how  the  picture  changes  if 
the  compiler  decides  to  inline  f: 


int g(int *);
int main(void) {
  int a = 0;
  return g(&a);
}

[Figure: stack diagram with a single activation record for main containing a=0; address labels 0xbffff9ac and 0xbffff99c.]


In the code on the left, the (static) function f has been removed entirely
(it has internal linkage, and therefore could not have been called from an
external translation unit). In main, the body of f has been inlined at what
was previously the call point. The variable a is now declared and initialized
in main rather than f.

1 Spilling, which is performed after register allocation, moves temporaries that
cannot be allocated in registers into function activation records.

In  the  memory  diagram  to  the  right,  the  two  activation  records  for  main 
and  f  have  been  coalesced  into  a  single,  slightly  larger  stack  frame  for 
main.  Because  a  is  stack-allocated  at  function  entry  for  main,  instead  of  f, 
it  is  placed  in  memory  at  a  different  location  than  it  was  previously,  before 
function  inlining  was  performed. 

The problem here is that the function g is now passed a different pointer
than it was before (the pointer references the same memory object, containing
the value of variable a, but the object itself has been allocated at a
different address). A craftily constructed implementation of g, such as the
following (bad) C module:

#include <stdint.h>

int g(int *p) {
  return ((uintptr_t)p == 0xbffff99c);
}

could in principle distinguish the two program fragments, pre- and
post-function inlining, by cleverly choosing the integer 0xbffff99c to equal
the address at which a is allocated before inlining has been performed.
When this implementation of g is linked to the program fragment above,
C compilers such as gcc and CompCert produce programs that generate
different return values, depending on whether function inlining is enabled
or not. This behavior does not indicate the presence of a bug in the
compilers. Instead, it is evidence that, from the perspective of the compiler
writers, g is an overly sensitive program context; the result of executing g
depends too much on implementation details of the translation unit(s) it is
linked with.
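
By contrast, a context that treats its argument abstractly cannot observe the inlining. The following is a minimal sketch, not taken from the dissertation: the names g_abstract, main_original, and main_inlined are hypothetical stand-ins for g and for the two versions of the module, chosen so that both shapes can live in one file.

```c
/* A context that treats its argument abstractly: it only reads and
 * writes the pointed-to object and never inspects the address. */
int g_abstract(int *p) {
    *p += 41;
    return *p + 1;
}

/* Shape of the module before inlining: f owns the local a. */
static int f_original(void) {
    int a = 0;
    return g_abstract(&a);
}

int main_original(void) {
    return f_original();
}

/* Shape of the module after inlining: a lives in the caller's frame. */
int main_inlined(void) {
    int a = 0;
    return g_abstract(&a);
}
```

Both entry points compute the same value regardless of where a is placed on the stack, because g_abstract depends only on the contents of the object it is handed.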

Constant  Propagation.  There  is  not  much  a  compiler  writer  can  do  if 
program  contexts  like  g  have  the  power  to  interrogate  memory  at  arbitrary 
addresses.  Such  contexts  break  all  abstraction,  and  therefore  rule  out  most, 
if  not  all,  program  optimizations  that  reorganize  memory  in  nontrivial 
ways. 

There are more subtle abstraction-breaking behaviors, however. Consider
the following C program fragment:

void g(int *);

int f(void) {
  int a; int b = 3;
  g(&a);
  return b;
}
Function f declares two local integer variables, a and b. It then calls external
function g, passing the address of variable a as argument.

What  value  does  function  f  return?  The  answer  depends,  of  course,  on 
the  implementation  of  g.  For  example,  if  g  is  the  following  (bad)  C  program: 

void g(int* p) {
  *(p + 1) = 4;
}

and we compile and execute with gcc2 at optimization level 0, we get result:

> gcc -O0 f.c g.c; ./a.out; echo $?

> 4

At this optimization level, gcc stack-allocates both a and b, in the
following configuration:

[Figure: stack diagram in which b occupies the slot at address &a + 1.]

The write to *(p + 1) in g becomes a write to &a + 1 == &b, which
overwrites the value of b with 4.

But  now  consider  what  happens  if,  viewing  f  in  isolation,  we  apply  a 
standard  program  optimization  like  constant  propagation,  resulting  in  the 
new  program: 


int f'(void) {
  int a;
  g(&a);
  return 3;
}

The optimized f' clearly has different behavior, when linked with g, than
the original f (it returns 3 instead of 4). This new behavior can be
demonstrated by compiling the original program at optimization level 1:

> gcc -O1 f.c g.c; ./a.out; echo $?

> 3

2 Version 4.7.2.
The problem, again, is not the compiler (soundness of basic optimizations
like constant propagation is uncontroversial) but the context g. By
writing to address &a + 1, g breaks the C-language abstraction. Variables
a and b represent two distinct runtime objects; pointer arithmetic between
them is undefined.
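
For comparison, here is a sketch of a context that stays within the object it was handed, under which the original function and its constant-propagated form agree. The names g_in_bounds, f_before_constprop, and f_after_constprop are hypothetical and do not appear in the dissertation.

```c
/* A context that writes only the object it was given, never *(p + 1). */
void g_in_bounds(int *p) {
    *p = 4;
}

/* Original function: b is read after the external call. */
int f_before_constprop(void) {
    int a; int b = 3;
    g_in_bounds(&a);
    return b;
}

/* After constant propagation: the read of b is replaced by the constant 3. */
int f_after_constprop(void) {
    int a;
    g_in_bounds(&a);
    return 3;
}
```

Because g_in_bounds cannot reach b through a pointer to a, both versions return 3 whatever stack layout the compiler chooses.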

Stack-Reuse  Optimization.  Compiler  optimizations  can  take  advantage 
of  the  C  object  abstraction  in  even  subtler  ways.  Consider  this  C  program: 

#include <stdio.h>

int g(int*, int*);
void f(void) {
  int a;
  {
    int b;
    printf("%d, ", g(&b, &a));
  }
  {
    int c;
    printf("%d\n", g(&c, &a));
  }
}

in which f allocates three local variables: a, which has function scope, and
b and c, which have (nonintersecting) block scope. The function g returns
the result of the pointer comparison (p+1)==q:

int g(int* p, int* q) {
  return (p+1)==q;
}

This program prints different results at different optimization levels.
For example, at optimization level 0 the result is "1, 0". At level 1 it is
"1, 1". At levels 2 and 3 it is "0, 0".

At optimization level 0, gcc allocates three stack slots for the three
distinct variables a, b, c, resulting in stack configuration:

[Figure: stack diagram with slots a, b, and c from top to bottom; b is at &a - 1 and c is at &b - 1.]

In this configuration, g(&b, &a) == 1, because &b + 1 == &a, whereas
g(&c, &a) == 0.

At optimization level 1, the compiler reuses b's stack slot for c (the two
variables have nonintersecting scope), resulting in configuration:

[Figure: stack diagram with slot a above a single slot, at &a - 1, shared by b and c.]

in which variables b and c have the same address. The fact that &b==&c
gives result "1, 1". The final output, "0, 0", results from the fact that at
higher optimization levels, a and b, c are allocated on the stack in reverse
order.
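
A layout-independent variant of this example can be sketched as follows. The names g_same_object and f_layout_independent are hypothetical, and the function records its results in an array instead of printing them. The comparison is restricted to pointers that each point to an object, for which C defines equality to hold only when both point to the same object.

```c
/* Compares two pointers to objects; no one-past-the-end address is
 * formed, so the result does not depend on how slots are laid out. */
int g_same_object(int *p, int *q) {
    return p == q;
}

/* Same block structure as f in the text, recording results in an array. */
void f_layout_independent(int results[2]) {
    int a;
    {
        int b;
        results[0] = g_same_object(&b, &a);
    }
    {
        int c;
        results[1] = g_same_object(&c, &a);
    }
}
```

Whether or not the compiler reuses b's slot for c, each comparison involves two distinct live objects, so both results are 0.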

The naive solution to these problems is to place limits on the compiler,
e.g., by disallowing compiler optimizations that rearrange memory. But this
is too expensive. Many standard compiler optimizations would be ruled out,
including function inlining, dead-code elimination, frame-pointer
optimization, constant propagation, stack-reuse optimization, etc. A second
potential solution is to prohibit address-taken local variables (globals,
which are typically not rearranged in memory by compilers, would still be
addressable). But then we no longer have C. Restricting the language is also
too syntactic: it does not easily generalize to other programming languages
besides C.

The  semantic  restrictions  on  program  contexts  that  I  define  in  this  thesis 
rule  out  overly  concrete  contexts  like  the  gs  above,  while  still  enabling 
interesting  programs.  The  details  of  the  semantic  restrictions  are  discussed 
in  Chapter  6.  At  a  very  high  level,  they  ensure  that  programs 

• treat pointers abstractly, by not comparing pointers with fixed integers,
as in the first g above;

• respect the C memory and object model, by distinguishing pointers
from integers, and by further distinguishing pointers to distinct memory
regions/objects;

• respect the interaction model imposed by the external function call
protocol (no control flow aside from function call/return across module
boundaries).

The advantage of stating and enforcing these conditions semantically, as
opposed to syntactically (e.g., via a type system or by restricting in which
languages program contexts may be implemented), is the flexibility to model
program contexts in a variety of languages: from C (Section 3.2.1) and x86
assembly (Section 3.2.2) to Coq's Gallina (Section 3.2.3).
1.1.2  Specifying  and  Compiling  Open  Programs

A presumption of the preceding is that we at least have a specification of
multilanguage programs. By multilanguage, I mean programs in which
some components are written in a language like C while others are written
in assembly, or possibly a third language. As Perconti and Ahmed [PA14]
have also observed, multilanguage semantics is useful not only for program
understanding, but also as a mechanism for stating cross-language contextual
equivalence, the compiler correctness criterion I employ in this thesis.
What are the difficulties here?

Consider first the whole program case, construed broadly: Imagine $P_S$
is a source program (e.g., in C or some other source language) and $P_T$ the
assembly code produced by compiling $P_S$. Then whole-program compiler
correctness states that $P_S$ and $P_T$ have the same observable behavior.3 How
does one express "same observable behavior"? Because $P_S$ and $P_T$ are
whole programs, we can prove, e.g., with respect to the big-step semantics
of the source and target languages, that $P_S \Downarrow v \iff P_T \Downarrow v$,
for all $v$. Or, in a small-step semantics, we might show that $P_S$ and $P_T$
produce corresponding traces of observable events.

As Benton and Hur [BH10] and Perconti and Ahmed [PA14] have both
remarked, things become more difficult when we move from (closed) whole
programs to consider open modules (those that call functions declared but
not defined in the current translation unit). In this more general setting,
multimodule source programs $P_S$, $P'_S$ are separately compiled to
multimodule targets $P_T$, $P'_T$. To state correctness, we must say that the
semantics of $P_S$ and $P_T$ correspond in some way, and likewise for $P'_S$
and $P'_T$. However, we cannot simply apply the usual notions of program
equivalence here, as we did in the whole-program case. Because the modules
are program fragments and not complete programs, we cannot "execute" them
in any meaningful sense.

The other force at play is the need for compositionality: Correctness
of the translation of one unit should compose with the correctness proofs
of other units to yield correctness of the whole program translation. In
other words, from (independent) proofs that $P_S = P_T$ and $P'_S = P'_T$, it
should be possible to deduce $P_S \bowtie P'_S = P_T \bowtie P'_T$, for some
suitable notion of linking $\bowtie$ and program equivalence $=$. In the most
general case, we will support source programs containing multiple compilation
units, each written in a different source language (C, x86 assembly, ML,
etc.), each calling functions defined either in the other compilation units
or by external entities such as an operating system.4

3 Or that every behavior of $P_T$ is a possible behavior of $P_S$, if $P_S$ is
nondeterministic (refinement).
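
The compositionality requirement can also be displayed as an inference rule. This only restates the requirement in the preceding paragraph, writing $\bowtie$ for linking and $=$ for the chosen notion of program equivalence; the rule layout is an illustration and does not appear in the source.

```latex
\[
\frac{P_S = P_T \qquad P'_S = P'_T}
     {P_S \bowtie P'_S \;=\; P_T \bowtie P'_T}
\]
```

The premises are established independently, one per compilation unit; the conclusion is the whole-program statement.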

I address these issues in Chapters 3, 4, 5, and 6. Chapters 3 and 4 provide
a solution to the basic problem of how to specify open modules, both in
isolation and in interaction, in the form of interaction and linking semantics,
respectively. Chapter 5, on structured simulations, shows how to specify the
correctness of compiler transformations on single translation units in a
modular way, without reference to the whole program. Chapter 6 proves that
structured simulations compose both vertically (i.e., transitively) and
horizontally. By horizontal composition, I mean that the structured simulations
induced by independently compiling the individual translation units of a
multimodule program compose to yield correctness of the whole-program
transformation.


1.2  Contributions  and  Thesis  Scope 

In  summary,  the  specific  contributions  of  this  dissertation  are: 

•  a  semantic  characterization  of  the  program  contexts  for  which  an 
optimizing  C  compiler  is  sound  (Chapter  6),  answering  the  question 
For  which  contexts  is  an  optimizing  C  compiler  sound? 

•  interaction  and  language-independent  linking  semantics  (Chapters  3 
and  4),  which  facilitate  the  statement  and  proof  of  cross-language 
program  equivalences  (Chapter  6),  answering  the  question  How  to 
achieve  language-independence? 

•  structured  simulations  (Chapter  5),  a  novel  extension  of  CompCert's 
forward  simulation  proof  method  that  composes  both  transitively  and 
horizontally,  across  program  modules,  answering  the  question  How  to 
reason  about  equivalence  of  open  modules? 

•  the  application  of  the  above  techniques  to  Compositional  CompCert 
(Chapter  8  and  [SBCA14]),  the  first  verified  separate  compiler  for  C, 
answering  the  question  Do  the  techniques  scale  to  languages  like  C,  and 
to  existing  verified  C  compilers  such  as  CompCert? 

To substantiate the utility of my approach, I show (Chapter 7) how to
connect the Verifiable C program logic [ADH+14] to Compositional CompCert.
The result is a system in which program properties can be proved
modularly at the source level, even of multilanguage programs, and yet are
shown to be preserved by separate compilation.

4 A distinct but equally important notion is vertical (i.e., transitive)
compositionality of the proofs of distinct compiler phases. Vertical
compositionality (cf. Section 6.1 of Chapter 6) is required to prove
correctness of any realistic multiphase compiler.

The Coq Proof Development. Except where otherwise indicated in the
text, the theorems in this thesis have been proved and machine-checked in
Coq. The development is divided between two GitHub repositories:

The Compositional CompCert repository contains the bulk of the development
and proofs (corresponding to Chapters 3-6). It is available at:

https://github.com/PrincetonUniversity/compcomp.

The Verified Software Toolchain (VST) repository contains the proofs that
connect the Verifiable C logic to the compiler (Chapter 7):

https://github.com/PrincetonUniversity/VST.

The thesis text provides pointers into the developments, where appropriate.


1.3  Relation  to  Previous  Work  by  the  Author  and 
Co-Authors 

Much of the material in Chapters 2 through 6 is based on previous work
by myself and co-authors. Interaction semantics and logical simulation
relations (the precursor to the structured simulations of Chapter 5) first
appeared in ESOP'14 [BSDA14]. Lennart Beringer did the initial work on
structured simulations and adapted many of CompCert's compiler phases.
He also proved that structured simulations compose transitively.
Language-independent linking and structured simulations are the topic of a
paper presented at POPL'15 [SBCA15]. Juicy memories are briefly described in a
chapter, which I co-authored, of Program Logics for Certified Compilers [ADH+14].
A second co-authored chapter of the same book gives preliminary advice
on "How to specify a compiler." Much of the material in Chapter 2 of
this dissertation has appeared before, some of it verbatim, in Chapter 32
of [ADH+14], of which I am a co-author.


1.4  Related  Work 

Compiler verification is one of the "big problems" of computer science, as
evidenced by the large body of research it has spawned in the 45 years or
so since McCarthy and Painter [MP67]. For a comprehensive survey up to
the year 2003, see [Dav03]. Here I focus on the most closely related work.
1.4.1  Whole-Program  Compilation

Moore [Moo89] was one of the first to mechanically verify a programming
language implementation (a compiler for a language called Piton). The most
well-known work in this vein since Moore is Leroy's CompCert C compiler
in Coq [Ler09], upon which Compositional CompCert is based. Chlipala
has also built verified compilers in Coq: first, from lambda calculus to
idealized assembly language [Chl07], and later for an impure functional
language [Chl10]. But both Chlipala's and Leroy's compilers were limited
to whole programs; they did not provide correctness guarantees, as I do
in this work, about the behavior of separately compiled multimodule
programs. More recently, Dockins [Doc12] completed an in-depth study of
notions of operational refinement for whole programs, with applications to
CompCert and compiler correctness more generally.

1.4.2  Compositional  Compilation  and  Logical  Relations 

Benton and Hur were two of the first to explicitly do compositional
specification of compilers and low-level code fragments, first for a compiler
from a simply typed functional language to a variant of Landin's SECD
machine [BH09], then for a functional language with polymorphism [BH10].
This initial work was followed by a string of papers, by Dreyer, Hur, and
collaborators, that resulted in refinements of the basic techniques
(step-indexed logical relations and biorthogonality). The refinements included
extensions to step-indexed Kripke logical relations, for dealing with state in
the context of more realistic ML-like languages [HD11], and more recently,
to relation transition systems (RTSs) [HDNV12] and the related parametric
bisimulations [HNDV13]. RTSs demonstrated that it was possible to do
bisimulation-style reasoning in the possible-worlds style of Kripke logical
relations and state transition systems; parametric bisimulations refined
RTSs by removing some technical restrictions. Both parametric bisimulations
and RTSs compose transitively, like our structured simulations but
unlike Kripke logical relations.

Although they focus on typed higher-order functional languages with
only limited forms of shared memory (mutable references), some of the
techniques used by Benton, Dreyer, Hur, and their collaborators have
interesting parallels in our own work. Our "us vs. them" protocol (Chapter 5) is
at least superficially similar to the "local vs. global knowledge" distinction
that is made in RTSs. One difference is that we distinguish between local and
external invariants on the state shared by modules, whereas in RTSs the local
vs. global distinction is really about different notions of term equivalence.
Also, our "them" invariants, which encapsulate one structured simulation's
view of the memory regions allocated by external functions, are not
quite "global" in the same sense as Hur et al.'s global knowledge. Perhaps
more fruitfully, one can view interaction semantics, and the structured
simulations that are "indexed to" interaction semantics, as an analogue of
the type structure used to index standard logical relations, but here applied
to imperative languages with impoverished type systems: C, x86, and the
other languages of CompCert. As in Kripke logical relations, structured
simulations use Kripke-style possible worlds to model memory allocation.

An alternative to language-independent interaction semantics is
multilanguage semantics [AB11], which combines several languages of a compiler
into a single host language via syntactic boundary casts in the style of
Matthews and Findler [MF07]. This makes it possible to state the correctness
of a separate compiler as contextual equivalence in the combined
language, as Perconti and Ahmed have recently done for a two-phase compiler
from System F with existential and recursive types [PA14]. But where
Perconti and Ahmed define contexts syntactically, as one-hole terms in the
combined language, we define contexts semantically, as interaction
semantics. McKay's variation of Perconti and Ahmed's approach replaces explicit
boundary conversion with programmatic conversion expressed as terms of
the combined language, but considers only a single transformation, closure
conversion [McK14].

1.4.3  Verifying  and  Compiling  Concurrency 

Liang et al.'s work [LFF12] on verifying concurrent program
transformations inspired my use of a rely-guarantee discipline, but the
complexity of stack frame management, spilling, and block coalescing in
Compositional CompCert made it difficult to apply their ideas directly in our
setting. Ley-Wild and Nanevski's subjective concurrent separation logic
(SCSL) [LWN13] used subjective rely-guarantee invariants on auxiliary state
to verify coarse-grained concurrent programs, such as parallel increment.
Later work by Nanevski et al. extended the techniques to support
verification of fine-grained concurrent programs [NLWSD14]. These subjective
invariants made their proofs robust to the thread structure of the
environment. Our "us vs. them" invariants serve a similar purpose: to prevent
module-local structured simulations from being sensitive to the exact
composition of their environment (other modules).

Also related is verified compilation of concurrent programs. Lochbihler
verified a whole-program compiler for multithreaded Java [Loc12]. Ševčík
et al. built CompCertTSO [SVN+13], which adapted CompCert's
correctness proofs to the x86 TSO weak memory model, in order to reason about
compilation of racy C code. Mansky's PTRANS framework [Man14] models
optimizations as rewrite operations on parallel control flow graphs,
specified using temporal logic formulae. While all three of these projects are
whole-program, there are some similarities with my work. For example,
both CompCertTSO and PTRANS lift program refinements from individual
threads to whole programs, as I do for interacting modules, under
certain noninterference conditions on shared state. One difference is that
PTRANS and CompCertTSO both state the noninterference conditions in
a "large footprint" way, as global whole-system invariants. My horizontal
composition results instead rely only on a module-local characterization of
noninterference, in the form of reach-closed semantics. That said, it would
be interesting future work to investigate whether the compositional
compilation approach I advocate in this thesis could be applied to verified
compilation with weak memory models.

1.4.4  Game  Semantics  for  Interaction 

The system-level semantics of Ghica and Tzevelekos [GT12] extended
standard game models of programs to the more general situation in which the
moves of the opponent (environment) are constrained by semantic rather
than combinatorial or syntactic restrictions. The key semantic constraint,
which Ghica and Tzevelekos call "epistemic" (the environment may only
update memory locations it learned about at interaction points), is similar
in some ways to the "reach-closed" restrictions I impose on source modules
in Theorem 5. In this dissertation, the restriction to reach-closed interaction
semantics was a natural side condition of the proof: the kinds of
transformations present in Compositional CompCert are just not sound in the
context of "omniscient" program contexts that may write to arbitrary
memory locations, even those (like return addresses and spills in function
stack frames) that are managed by the compiler. One major difference to the
work of Ghica and Tzevelekos is that I apply the techniques to the
two-program setting of compositional compilation; Ghica and Tzevelekos were
concerned primarily with modeling the interactions of a single program
module with its environment.

1.4.5  The  Bleeding  Edge 

A number of research groups have recently begun working on
compositional compilation for realistic languages. Tahina Ramananandro, along
with colleagues at Yale, has proposed a new separate compilation
framework [RSW+15] for CompCert that compares favorably in many respects
to the approach I describe in this thesis. For example, in Ramananandro's
approach, linking is defined semantically on module behaviors, in mixed
big-step/small-step style, which parallels my small-step semantic linking
operator C (Chapter 4). This approach leads to an elegant specification (and
proof) of soundness of syntactic linking with respect to the semantic linking
semantics, when modules are all implemented in the same language. I do
not yet have such a proof (though see Chapter 9 for further discussion).
One apparent disadvantage of Ramananandro's approach is that it requires a
stronger notion of memory transformation (essentially, bijection) between
source and target memories over each compiler transformation. To construct
such a bijection for CompCert's Cminorgen phase, it was necessary to add
"tags" to memory regions, and to update the semantics of the CompCert
languages (e.g., Csharpminor and Cminor). Bijective relations are not
necessary in Compositional CompCert. Also, my colleagues and I have verified
a complete compiler (cf. Chapter 8); Ramananandro et al. have so far verified
only a few of CompCert's (admittedly more difficult) phases.

Very recently, Chung-Kil Hur and Jeehon Kang completed a proof of
separate compilation for the most recent version of CompCert (at the time
of writing, version 2.4) [Com]. The theorem proved is more limited than
that of this thesis; in particular, Hur and Kang's proof does not say
anything about linking with modules that were not compiled by CompCert
from source modules in CompCert C. At the same time, the Hur and Kang
proof is an elegant piece of engineering: it manages to factor the proof to
use many of CompCert's forward simulations intact. In private
communication, Hur has mentioned that next steps will include the application of
recent (unpublished) work, by Hur and others, on parametric inter-language
simulations [PIL] to CompCert, in order to extend their proof to support
linking with more general program contexts.

Wang, Cuellar, and Chlipala, in recent work at OOPSLA [WCC14], showed
how to connect verified multilanguage programs to a verified compiler for a
small C-like language (Cito). Their approach builds the axiomatic
specifications of external functions, as Hoare-style pre/post-conditions on
abstract data types, into the operational semantics of their source language.
Compositional CompCert avoids tying axiomatic specifications, and thus the
details of the program logic, to compiler correctness.


Chapter 2


The CompCert Memory Model


In the semantics of imperative languages, a memory model defines the
meaning of the memory-manipulating operations supported by the language,
such as memory load and store. For toy imperative languages, this model
may be as simple as a partial map from locations to values.

mem = loc ⇀ val

In  a  language  like  C,  the  memory  model  is  significantly  more  complicated. 
It  must  specify,  among  other  things, 

•  the C object model, in order to define which pointer arithmetic and
comparison operations are valid;

•  the byte- vs. word-level representation of data, to handle standard
library functions such as memcpy [ISO11, 7.24.2.1] and their interaction
with normal loads and stores;

•  data alignment, e.g., to word boundaries;

•  and, for Pthreads-style shared-memory concurrency and certain
constant propagation optimizations, permissions on memory values that
restrict or enable access to parts of the memory state.

In this chapter, I give background on CompCert's memory model [LABS14]
and describe how it handles the above. The first section introduces
CompCert memory-model basics, such as the addressing model and value types.
These aspects are mostly unchanged even from the earliest versions of
CompCert (up to version 1.10, which I collectively call version 1 [LB08]). In
Section 2.2, I describe several enhancements to the memory model which I
contributed to, including the addition of memory permissions. The original
motivation for permissions was shared-memory concurrency; however, the
permissions turned out to be useful in proofs of separate compilation as
well (Chapters 5 and 6). Permissions are also used, in standard CompCert,
to reason about optimization of read-only global variables.

Figure 2.1: A CompCert memory (version 1). The hatched region (block 2)
has been deallocated. Address (b,z) = (4,2) contains an abstract "byte" (in
gray). Block 6 is nextblock, the unreserved region to be returned by the next
allocation operation.


2.1  Memory  Model  Basics 

The CompCert memory model is block- (or region-) structured. Addresses
(b,z) are pairs of a block/region identifier b (a positive number) and an
integer offset z. In CompCert's high-level C-like languages (e.g., CompCert
C and Clight), blocks are allocated one per global variable, one per call to
malloc, and, per function invocation, one per addressed local variable.
Pointer arithmetic

(b,z) + n = (b, z + n)

is then allowed only within, not between blocks. This regime (modeling
globals, addressed locals, and malloc'd regions as distinct blocks) prevents
pointer arithmetic across, e.g., distinct addressed locals or distinct globals,
which is undefined in C.

Each address (b,z) specifies an abstract "byte", at offset z in block b.
Offsets may be either positive or negative. In the first version of the CompCert
memory model, blocks were bounded above and below by two functions:
low_bound m b, which gives the low bound of block b, and high_bound m b,
which gives the upper bound of the block. Loads and stores succeeded only
to offsets below the high bound and above the low bound.
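
As a rough illustration of the addressing regime, the following Python sketch
models addresses as (block, offset) pairs. It is not CompCert code: the names
Addr, ptr_add, and ptr_diff are hypothetical, chosen only to show that
arithmetic moves the offset and never the block.

```python
from typing import NamedTuple, Optional


class Addr(NamedTuple):
    """Hypothetical model of an address (b, z)."""
    block: int   # block identifier b (a positive number)
    offset: int  # integer offset z (may be negative)


def ptr_add(a: Addr, n: int) -> Addr:
    """(b, z) + n = (b, z + n): the block component never changes."""
    return Addr(a.block, a.offset + n)


def ptr_diff(a: Addr, c: Addr) -> Optional[int]:
    """Pointer difference is only meaningful within one block."""
    if a.block != c.block:
        return None  # arithmetic between distinct blocks is not defined
    return a.offset - c.offset


p = ptr_add(Addr(4, 2), 3)
```

No amount of offset arithmetic on an address in block 4 can produce an
address in block 5, which is how the model rules out arithmetic across
distinct variables.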

Figure 2.1 depicts this situation for a representative CompCert memory.
There are 5 allocated blocks, numbered 1 through 5. Block 2 has been allocated
but then freed (indicated by black-white hatching). The CompCert memory
allocation model assumes an infinite number of memory regions; it will
never reuse block 2. The gray box is an (abstract) byte at address (4,2),
block 4 at offset 2.

Block 6 is the next free memory region, called nextblock in CompCert.
At the next allocation operation

alloc : mem → Z → Z → mem × block
alloc m lo hi = (m', nextblock m)

the CompCert memory model returns (m', nextblock m), where nextblock m
is the block number of the newly allocated region (= 6 in Figure 2.1 above)
and m' is the updated memory state that records the allocation (e.g., by
incrementing nextblock). The low-high boundaries of newly allocated
memory regions are given, at block allocation time, by the integers lo, hi.
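
The allocation discipline can be sketched in a few lines of Python. This is a
toy model under stated assumptions, not CompCert's definition: the class MemV1
and the choice to represent a freed block by empty bounds (0, 0) are
illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class MemV1:
    """Toy version-1 memory: block bounds plus the nextblock counter."""
    nextblock: int = 1
    bounds: Dict[int, Tuple[int, int]] = field(default_factory=dict)


def alloc(m: MemV1, lo: int, hi: int) -> Tuple[MemV1, int]:
    """alloc m lo hi = (m', nextblock m)."""
    b = m.nextblock
    new_bounds = dict(m.bounds)
    new_bounds[b] = (lo, hi)
    return MemV1(nextblock=b + 1, bounds=new_bounds), b


def free(m: MemV1, b: int) -> MemV1:
    """Assumed encoding of free: empty the bounds, keep the identifier."""
    new_bounds = dict(m.bounds)
    new_bounds[b] = (0, 0)
    return MemV1(nextblock=m.nextblock, bounds=new_bounds)


m0 = MemV1()
m1, b1 = alloc(m0, 0, 8)
m2, b2 = alloc(m1, 0, 4)
m3 = free(m2, b1)          # block 1 is freed ...
m4, b3 = alloc(m3, 0, 4)   # ... but its identifier is never handed out again
```

Because nextblock only grows, the identifier of a freed block is never
reused, matching the treatment of block 2 in Figure 2.1.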

2.1.1  Values,  Loads,  and  Stores 

Values.  CompCert's  intermediate  languages  share  a  common  value  type 
val,  defined  by  the  following  inductive  data  type: 

Inductive val : Type :=
| Vundef : val
| Vint : int → val
| Vlong : int64 → val
| Vfloat : float → val
| Vptr : block → int → val.

Vundef is the undefined value, associated to uninitialized local variables.
Vint i is an integer value, with i a 32-bit machine integer. Vlongs represent
64-bit machine integers. Vfloats are floating-point numbers. Vptr b i is a
pointer value, addressing block b and (machine-integer) offset i. To convert
from a Vptr to an actual CompCert memory location, one must convert i
from type int to type Z, a lossless operation. (Converting from Z to int can
overflow, however.)

Loads. Memory loads in CompCert have the following shape:

load : mem → memory_chunk → block → Z → option val
load m ch b z = Some v

Loading from memory m at address (b,z), with chunk type ch, gives value
v. The chunk types ch : memory_chunk specify the size, type, and alignment
constraints that apply to a given load or store operation.

Inductive memory_chunk : Type :=
| Mint8signed     (* 8-bit signed integer *)
| Mint8unsigned   (* 8-bit unsigned integer *)
| Mint16signed    (* 16-bit signed integer *)
| Mint16unsigned  (* 16-bit unsigned integer *)
| Mint32          (* 32-bit integer, or pointer *)
| Mint64          (* 64-bit integer *)
| Mfloat32        (* 32-bit single-precision float *)
| Mfloat64        (* 64-bit double-precision float *)
| Mfloat64al32.   (* 64-bit double-precision float, 4-aligned *)

For  example,  a  load  in  C  from  the  address  of  a  32-bit  single-precision  float 
variable,  as  in  the  program  fragment: 

float  f;  *&f; 

is modeled as load with chunk type Mfloat32. Each chunk type has a natural
size |ch| in bytes and a natural alignment ⟨ch⟩. For example, |Mfloat64al32|
equals 8 while ⟨Mfloat64al32⟩ equals 4 (4-aligned 64-bit double-precision
float).

A memory load will fail with None (return type option val) when either
(i) the address is misaligned (for the given chunk type) or (ii) the load
violates the bounds of the addressed block. The rules, for a load
(b,z) at chunk ch, are:

•  Alignment: ⟨ch⟩ divides z;

•  Bounds-Checking: low_bound m b ≤ z ∧ z + |ch| ≤ high_bound m b.
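
The two rules above can be sketched as a single Python check. The tables SIZE
and ALIGN and the function valid_access are hypothetical names; the sizes
follow the chunk comments, while the alignments (equal to the size, except
for Mfloat64al32) are an assumption made for illustration.

```python
# Natural size |ch| in bytes for each chunk type.
SIZE = {
    "Mint8signed": 1, "Mint8unsigned": 1,
    "Mint16signed": 2, "Mint16unsigned": 2,
    "Mint32": 4, "Mint64": 8,
    "Mfloat32": 4, "Mfloat64": 8, "Mfloat64al32": 8,
}

# Assumed alignment: same as the size, except the 4-aligned double.
ALIGN = dict(SIZE, Mfloat64al32=4)


def valid_access(lo: int, hi: int, ch: str, z: int) -> bool:
    """Version-1 access check for offset z in a block with bounds [lo, hi)."""
    aligned = z % ALIGN[ch] == 0
    in_bounds = lo <= z and z + SIZE[ch] <= hi
    return aligned and in_bounds
```

For instance, in a block with bounds 0 and 12, an Mfloat64al32 access at
offset 4 passes both checks, whereas an Mfloat64 access at the same offset is
rejected as misaligned under the assumed alignment table.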


Stores. Storing value v in memory m at address (b,z), with chunk type
ch, gives new memory m'.

store : mem → memory_chunk → block → Z → val → option mem
store m ch b z v = Some m'

Stores are partial, like loads. Stores also respect the same alignment and
bounds-checking rules as memory loads (stores must be aligned, with
respect to the alignment of the memory chunk, and must be within bounds).


The interactions of loads and stores (as well as loads and allocations and
loads and frees) are governed by a number of rules, described in detail in
Chapter 44 of Appel et al.'s new book [ADH+14]. I briefly summarize them
here.

The  common  case  is  load-after-store,  in  which  we  load  from  the  same 
address  (same  block  and  offset)  that  we  previously  stored. 

store m ch b z v = Some m'
load m' ch' b z = Some (convert ch' v), if |ch| = |ch'|

The C standard [ISO11, 6.5] states that load-after-store should succeed only
if chunk ch' (representing a C type) is compatible with ch, the chunk type at
which the address was last assigned. Compatibility means that two types
differ only in qualifiers and signedness. CompCert models compatibility
by:

•  requiring that |ch| = |ch'|;

•  implicitly casting v to type ch at store-time;

•  converting v to type ch' at loads.

Think  of  convert  as  a  C  cast.  For  example,  if  we  attempt  to  load  an  integer 
Vint  i  at  chunk  type  Mint8signed,  convert  will  return  the  8-bit  sign  extension 
of  i. 
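
To make the cast analogy concrete, here is a minimal Python sketch of convert
on integer values. The helper names sign_ext and zero_ext are hypothetical,
integers are represented by their mathematical (signed or unsigned) value,
and only the integer chunk types are covered.

```python
def zero_ext(bits: int, i: int) -> int:
    """Keep the low `bits` bits of i, read as an unsigned number."""
    return i & ((1 << bits) - 1)


def sign_ext(bits: int, i: int) -> int:
    """Keep the low `bits` bits of i, read as a two's-complement number."""
    low = zero_ext(bits, i)
    return low - (1 << bits) if low >= (1 << (bits - 1)) else low


def convert(ch: str, i: int) -> int:
    """Sketch of convert on an integer payload, by chunk type."""
    if ch == "Mint8signed":
        return sign_ext(8, i)
    if ch == "Mint8unsigned":
        return zero_ext(8, i)
    if ch == "Mint16signed":
        return sign_ext(16, i)
    if ch == "Mint16unsigned":
        return zero_ext(16, i)
    if ch == "Mint32":
        return sign_ext(32, i)
    raise ValueError("chunk type not covered by this sketch: " + ch)
```

Loading the integer 0x17F at Mint8signed keeps only the low byte 0x7F, while
loading 0xFF at the same chunk type yields the sign-extended value -1.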

The other load-after-store cases are disjoint load-stores, loads that
"overlap", and loads which use incompatible types. These cases are summarized
in the following figure from Chapter 32 of [ADH+14].

[Figure: a store to a range of bytes, shown against four kinds of later
load: compatible load, incompatible load, disjoint loads, and overlapping
loads.]

In the "incompatible" and "overlapping" cases, we attempt to load with an
incompatible chunk type (e.g., of the wrong size) or from a memory region
that only partially overlaps (byte-for-byte) the stored region. Since the first
version of the CompCert memory model did not expose the underlying
byte representation of values, such "bad" loads just returned None.


2.2  Memory  Model,  Version  2

Version 2 of the CompCert memory model improved upon version 1 in two
major ways. The first was to expose the byte-level representation of values in
memory, in order to give semantics to operations like memcpy and the
overlapping loads I described in the previous section. The second innovation
was to add permissions to the memory model, which replaced the low_bound
and high_bound functions of version 1 and prepared the CompCert compiler
and its memory model for connection to a concurrent separation logic (as
described briefly in Chapter 44 of [ADH+14]). As it turned out, however,
the permissions added in version 2 were also convenient for stating the
separate compilation invariants of Compositional CompCert. In this section, I
briefly describe the version 2 memory model's support for byte-level data
representations. Then I introduce the permission model.

Byte-Level  Data  Representation.  Version  1  of  the  CompCert  memory 
model  gave  accurate  semantics  to  most  memory  loads  and  stores  (and 
allocations  and  frees)  but  hid  the  underlying  byte-level  representation  of 
values.  In  version  2,  each  location  is  mapped  to  a  memval,  an  abstraction 
of  a  single  byte  of  data.  Memory  loads  (and  stores)  then  decode  (encode) 
sequences  of  memvals  as  values. 

The  memvals  themselves  are  defined  inductively: 

Inductive memval : Type :=
| Undef : memval
| Byte : byte → memval
| Pointer : block → int → nat → memval.

Undef  memvals  represent  undefined  bytes,  as  might  be  associated  with 
the  values  of  stack-allocated  uninitialized  local  variables  (four  Undefs 
compose  a  single  Vundef). 

Bytes are 8-bit machine integers in the range 0...255, which compose larger
values such as integers, floats, or longs, taking aspects like endianness
and the IEEE encoding of floats into account.

Pointer memvals are abstractions of the bytes that compose pointer values.
Pointer b i n is the n-th chunk of a pointer value Vptr b i. The intent
is to encode pointers in a way that does not expose their underlying
representation (e.g., as bytes, which would be too concrete), but
still supports operations like memcpy. In the most recent versions
of CompCert (> 2.4), Pointers are generalized to memory Fragments,
which allow uninterpreted encodings of data values other than just
pointers.

Along with the memval interpretation of bytes in memory, CompCert defines
operations that encode/decode values to/from sequences of memvals. More
details are available in [ADH+14].
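
The following Python sketch illustrates the idea of encoding and decoding
through memvals for 4-byte values. It is a simplification under stated
assumptions: memvals are modeled as tagged tuples, integers are encoded
little-endian, and pointer fragments are stored in increasing order of n.
None of these names or layout choices is taken from CompCert's definitions.

```python
UNDEF = ("Undef",)


def Byte(n):
    return ("Byte", n)


def Pointer(b, i, n):
    return ("Pointer", b, i, n)


def encode_int32(i):
    """Split a 32-bit integer into four Byte memvals (little-endian)."""
    return [Byte((i >> (8 * k)) & 0xFF) for k in range(4)]


def encode_ptr(b, i):
    """A pointer becomes four abstract fragments, never concrete bytes."""
    return [Pointer(b, i, n) for n in range(4)]


def decode32(mvs):
    """Decode four memvals as a value; anything inconsistent is Vundef."""
    if len(mvs) != 4:
        return ("Vundef",)
    if all(mv[0] == "Byte" for mv in mvs):
        return ("Vint", sum(mv[1] << (8 * k) for k, mv in enumerate(mvs)))
    if all(mv[0] == "Pointer" for mv in mvs):
        b, i = mvs[0][1], mvs[0][2]
        if all(mv == Pointer(b, i, n) for n, mv in enumerate(mvs)):
            return ("Vptr", b, i)
    return ("Vundef",)


# A memcpy-style copy moves memvals one by one and preserves pointers.
copied = list(encode_ptr(3, 8))
```

A byte-wise copy of a pointer therefore decodes to the same pointer, while a
mix of integer bytes and pointer fragments decodes to the undefined value.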

Permissions. The second innovation in version 2 was to add permissions
to the model, associated with each byte in memory.¹ The permissions are:

Freeable   top permission: can compare, read, write, and free
Writable   can compare, read, and write but not free
Readable   can compare and read but not write or free
Nonempty   can only compare

"Compare" is the ability to compare a pointer to the given location with
other pointers. We say a pointer Vptr (b,z) is valid for pointer comparison
if the location (b,z) has at least Nonempty permission.

Permissions accumulate: having permission p implies having all
permissions p' ≤ p, where the permission order is defined

Nonempty < Readable < Writable < Freeable

A  memory  location  may  have  no  permission  at  all.  In  this  case,  we  say  that 
the  location  is  empty.  This  is  typically  the  case  for  locations  that  have  not 
yet  been  allocated,  or  which  have  already  been  freed. 
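
The permission order is a simple total order, which can be sketched in Python
with an integer-valued enumeration. The names Perm and has_perm are
hypothetical; an empty location is modeled as None.

```python
from enum import IntEnum
from typing import Optional


class Perm(IntEnum):
    """Permissions, ordered Nonempty < Readable < Writable < Freeable."""
    Nonempty = 1
    Readable = 2
    Writable = 3
    Freeable = 4


def has_perm(held: Optional[Perm], wanted: Perm) -> bool:
    """Holding p implies holding every p' <= p; None models 'empty'."""
    return held is not None and held >= wanted
```

A location holding Writable thus also counts as Readable and Nonempty, but
not as Freeable, and an empty location satisfies no permission at all.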

Every byte location is associated not to one, but to two permissions: the
current permission and the max permission. Throughout an execution, the
current permission is always less than or equal to the max permission. The
max permission evolves predictably over a location's lifetime: when the
location is allocated, it has max permission Freeable; this permission can
later be lowered by a drop_perm operation;² finally, freeing the location
removes all its max permissions, making the location empty. The max
permission can only decrease once the location has been allocated. In contrast,
the current permission can decrease or increase (without ever exceeding
the max permission) during the lifetime of the location. For example, in a
(proposed) extension to shared memory concurrency, an unlock operation
temporarily drops current permissions, which can be recovered by a
subsequent lock operation. The following figure (also from [ADH+14, Chapter
32]) illustrates:

[Figure: the current and max permissions of a location across the
operations alloc, drop, unlock, read-lock, unlock, write-lock.]

¹ The material in this subsection has significant overlap, some of it verbatim,
with [ADH+14, Chapter 32], of which I am a co-author. See that work for further
details.

² drop_perm was added in version 2 and is used to model, e.g., the lowering
of a constant global's permission from Freeable to Readable.

The  association  of  permissions  to  locations  is  defined  by  the  predicate: 


perm : mem → block → Z → perm_kind → permission → Prop


where perm_kind is the inductive Max | Cur. The proposition perm m b z k p
means memory state m at location (b,z) has k-permission at least p. The
cumulativity of permissions, and the fact that current permissions are never
above max permissions, are expressed by the following implications:

perm m b z k p ∧ p' ≤ p ⟹ perm m b z k p'

perm m b z Cur p ⟹ perm m b z Max p


In version 1 of the memory model, load and store operations checked that
offsets were within bounds. In version 2, load and store instead check that
the accessed locations have current permissions at least Readable (Writable
for store). Likewise, free checks that the affected locations have current
permissions at least Freeable. Defining

range_perm (m : mem) (b : block) (lo hi : Z)
           (k : perm_kind) (p : permission) : Prop :=
  ∀z. lo ≤ z < hi ⟹ perm m b z k p.

as the predicate that is true iff all offsets between lo (inclusive) and hi
(exclusive) have permission at least p, we get the following access
conditions for the various memory operations:

Operation...            succeeds if and only if...
load m ch b z           range_perm m b z (z + |ch|) Cur Readable
store m ch b z v        range_perm m b z (z + |ch|) Cur Writable
free m b l h            range_perm m b l h Cur Freeable
drop_perm m b l h p     range_perm m b l h Cur Freeable

In general, permissions are preserved by operations over memory states,
with the following exceptions:

•  alloc m l h = (m', b) sets Max and Cur to Freeable in range (b, l) to
(b, h - 1).

•  free m b l h drops all permissions to empty in (b, l) to (b, h - 1).

•  drop_perm m b l h p sets Max and Cur in range (b, l) to (b, h - 1) to p.
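
The access conditions and permission updates above can be exercised with a
small Python model. It is a sketch under stated assumptions rather than
CompCert's definition: the class Mem is hypothetical, it tracks permissions
only (no contents and no alignment), and it updates the memory in place
instead of returning a new state.

```python
from enum import IntEnum


class Perm(IntEnum):
    Nonempty = 1
    Readable = 2
    Writable = 3
    Freeable = 4


class Mem:
    """Permission-only toy memory; absent keys model empty locations."""

    def __init__(self):
        self.nextblock = 1
        self.cur = {}  # (block, offset) -> current permission
        self.max = {}  # (block, offset) -> max permission

    def perm(self, b, z, kind, p):
        table = self.cur if kind == "Cur" else self.max
        held = table.get((b, z))
        return held is not None and held >= p

    def range_perm(self, b, lo, hi, kind, p):
        # All offsets z with lo <= z < hi have kind-permission at least p.
        return all(self.perm(b, z, kind, p) for z in range(lo, hi))

    def alloc(self, lo, hi):
        b = self.nextblock
        self.nextblock += 1
        for z in range(lo, hi):
            self.cur[(b, z)] = self.max[(b, z)] = Perm.Freeable
        return b

    def free(self, b, lo, hi):
        if not self.range_perm(b, lo, hi, "Cur", Perm.Freeable):
            return False
        for z in range(lo, hi):
            del self.cur[(b, z)]
            del self.max[(b, z)]
        return True

    def drop_perm(self, b, lo, hi, p):
        if not self.range_perm(b, lo, hi, "Cur", Perm.Freeable):
            return False
        for z in range(lo, hi):
            self.cur[(b, z)] = self.max[(b, z)] = p
        return True

    def can_load(self, b, z, size):
        return self.range_perm(b, z, z + size, "Cur", Perm.Readable)

    def can_store(self, b, z, size):
        return self.range_perm(b, z, z + size, "Cur", Perm.Writable)
```

For example, after dropping the first four bytes of a fresh 8-byte block to
Readable, stores to those bytes fail, loads still succeed, and the block can
no longer be freed as a whole.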


2.3  Memory  Transformations 

To be useful in compiler proofs, a C memory model must also support the
memory transformations performed by an optimizing compiler: spilling
and reloading, dead-code elimination, function inlining, etc. Operations
like load and store should be insensitive to the memory transformations
performed by these optimizations. Otherwise, the optimizations may change
the observable behavior of the program (by causing, e.g., a store that
succeeded before compilation to fail afterward).

Consider CompCert's SimplLocals phase, which pulls unaddressed local
variables (such as a in the following C program) out of memory and into
a temporaries (register) environment.


int f(void) {
  int a = 1; float b;
  g(&b);
  return a;
}

Before SimplLocals


In CompCert's highest-level languages, invoking f will generate a memory
state that contains two blocks for f's locals, one for a (call it block 1) and
another for b (call it block 2). SimplLocals will detect that a is never
addressed, however, which means it can promote a to a register.


[Figure: the memories before (m) and after (tm) SimplLocals.]


The memory before the transformation is m, the memory after tm. After
SimplLocals, f allocates just one block (for b) instead of two blocks. The
identifiers assigned to blocks have also changed: b's block (formerly block
2) is renamed block 1.


To model transformations of this form, CompCert uses a generalization
of block renaming called memory injection:

f : block → option (block × Z)
f b = Some (b', δ)

A memory injection f maps a subset of the blocks b ∈ dom(m) to new
blocks b' ∈ dom(tm), at offset δ. For example, the SimplLocals transformation
described above is modeled by the memory injection

f (Block 1) = None
f (Block 2) = Some (Block 1, 0)

that maps block 1 to None and block 2 to block 1, at offset 0.


We extend the relation f to memvals as follows:

memval_inject f Undef mv
memval_inject f (Byte n) (Byte n)
memval_inject f (Pointer b z n) (Pointer b' z' n)
    iff f b = Some (b', δ) ∧ z' = z + δ


The relation


val_inject f v v'


defines the analogous lifting to vals (pointer values are injected; Vundef
values are refined to arbitrary values; otherwise, e.g., on integers, val_inject
is the identity relation). We use notation vals_inject f v v' to denote the
pairwise application of val_inject f to the sequences of values v and v'.
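
The lifting of an injection to values can be sketched in Python as follows.
This is an illustration only: an injection is modeled as a dictionary from
source blocks to (target block, offset) pairs, with absent keys playing the
role of None, values are tagged tuples, and the machine-integer wraparound of
pointer offsets is ignored.

```python
def val_inject(f, v, v2):
    """Does value v inject to value v2 under injection f?"""
    if v[0] == "Vundef":
        return True                      # Vundef refines to any value
    if v[0] == "Vptr":
        _, b, z = v
        target = f.get(b)
        if target is None:
            return False                 # block b is not mapped by f
        b2, delta = target
        return v2 == ("Vptr", b2, z + delta)
    return v == v2                       # identity on non-pointer values


def vals_inject(f, vs, vs2):
    """Pairwise application of val_inject to two sequences of values."""
    return len(vs) == len(vs2) and all(
        val_inject(f, v, v2) for v, v2 in zip(vs, vs2))


# The SimplLocals injection: block 1 is unmapped, block 2 maps to block 1.
f_simpl = {2: (1, 0)}
```

Under f_simpl, a pointer into block 2 is related to the pointer at the same
offset in block 1, whereas a pointer into block 1 (the promoted variable) is
related to nothing.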


Memory injections support more complicated memory transformations
as well. For example, CompCert's Cminorgen phase coalesces the multiple
blocks allocated at a given function invocation into a single "Cminor" stack
block, to facilitate the final layout of stack frames in memory in the Stacking
phase. This transformation has the following general form:

The two blocks on the left are mapped into a single larger block on the right,
with the upper-left block being transposed at a nonzero positive offset into
the block on the right.

Passes like CompCert's spilling phase may also extend memory blocks,
in order to accommodate, e.g., registers that have been spilled into stack
frames. These transformations have the form:


in which the block on the left has been injected into the block at the right
(at a zero or nonzero offset). The block on the right contains fresh locations
(hatched regions) not present in the left block.

Some memory transformations are not expressible as memory injections.
Consider the following two diagrams in which (a) a block is related
simultaneously to two blocks; and (b) two blocks are simultaneously mapped to
overlapping offsets of a single target region.


Memory injections are functions, ruling out (a). CompCert memory
injections prohibit overlap as in (b). Neither (a) nor (b) is needed in CompCert.
However, it's possible that a weakening of memory injections to support (a)
and (b) could be useful for future compiler optimizations.


reach : mem → set block → list (block × Z) → set block

reach m R nil = R

reach m R ((b', z') :: L) =
    { b | b' ∈ reach m R L
        ∧ perm m b' z' Cur Readable
        ∧ ∃z. m(b', z') = Vptr (b, z) }

REACH m R = { b | ∃L. b ∈ reach m R L }

Figure 2.2: Reachability

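
The no-overlap condition of case (b) can be sketched as a check on the images
of source blocks. This is a simplification made for illustration: it uses
version-1-style block bounds, whereas a permission-based formulation would be
needed for version 2, and the function name no_overlap is hypothetical.

```python
def no_overlap(f, bounds):
    """f: block -> (target block, offset); bounds: block -> (lo, hi)."""
    images = []
    for b, (b2, delta) in f.items():
        lo, hi = bounds[b]
        images.append((b2, lo + delta, hi + delta))
    for i in range(len(images)):
        for j in range(i + 1, len(images)):
            t1, lo1, hi1 = images[i]
            t2, lo2, hi2 = images[j]
            # Two half-open intervals in the same target block intersect.
            if t1 == t2 and max(lo1, lo2) < min(hi1, hi2):
                return False
    return True


bounds = {1: (0, 8), 2: (0, 4)}
coalesce = {1: (1, 0), 2: (1, 8)}   # Cminorgen-style: block 2 placed after block 1
clash = {1: (1, 0), 2: (1, 4)}      # case (b): images overlap in the target
```

The coalescing injection passes the check because the two source blocks land
in adjacent, disjoint ranges of the target block; the second injection is
rejected.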

In general, we say a memory injection f injects memory m to tm, written

inject f m tm

if (1) permissions at locations in m are preserved in tm under the
transformation f; and (2) each readable byte value mapped by f in m is related by
memval_inject to the corresponding byte in tm.

Memory injections, which are key for proving correctness of
optimizations that reorganize memory layout, behave the same in versions 1 and
2 of the CompCert memory model. However, memory injections are not
quite expressive enough to support full separate compilation. Among other
things, they do not cleanly distinguish "private" memory regions (the
compiler has freedom to optimize these regions more aggressively) from public
regions to which pointers may have been leaked. Structured simulations
(Chapter 5) enrich memory injections with additional structure to support
such fine-grained invariants.


2.4  Validity  and  Reachability 

In addition to structured injections and simulations, which we'll first
present in Chapter 5, we require a few additional memory-model-related
definitions, some of which are not present in standard CompCert.

We say a memory region b is reachable in memory m from a set of root
blocks R (REACH m R b, Figure 2.2) when there is a path L of readable
pointers starting from a root region in R and ending at b. A pointer
Vptr (b, z) is readable when location (b, z) has at least Readable permission.

Reachability will play an important role in Chapters 5 and 6. We calculate
reachability on memory regions b rather than locations (b, z) because pointer
arithmetic is always allowed within (readable) regions. Hence an entire
region is reachable once any location within it is reachable.
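Block-level reachability can be computed as a least fixed point. The sketch below is a minimal Python illustration, assuming a toy memory that maps only readable locations to their contents (so the Readable check is implicit) and a hypothetical function name reach_closure; it is not the Coq definition of Figure 2.2.

```python
from typing import Dict, Set, Tuple, Union

Block = int
# A cell holds either an integer or a pointer ("ptr", block, offset).
Value = Union[int, Tuple[str, Block, int]]
# Assumption: only readable locations appear as keys of the memory.
Memory = Dict[Tuple[Block, int], Value]


def reach_closure(mem: Memory, roots: Set[Block]) -> Set[Block]:
    """Least set of blocks containing `roots` and closed under readable pointers."""
    reached = set(roots)
    worklist = list(roots)
    while worklist:
        b = worklist.pop()
        for (b2, _ofs), val in mem.items():
            if b2 != b:
                continue
            # Any pointer stored anywhere in a reached block makes its
            # target block reachable, whatever the target offset is.
            if isinstance(val, tuple) and val[0] == "ptr":
                target = val[1]
                if target not in reached:
                    reached.add(target)
                    worklist.append(target)
    return reached


mem = {
    (1, 0): ("ptr", 2, 8),   # block 1 holds a pointer into block 2
    (2, 0): 42,
    (2, 4): ("ptr", 3, 0),   # block 2 holds a pointer to block 3
    (4, 0): ("ptr", 1, 0),   # block 4 points at block 1 but is not a root
}
```

Starting from root block 1, the closure contains blocks 1, 2 and 3 but not block 4, since nothing reachable points to it.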

The following definitions (already present in standard CompCert) will
also prove useful in later chapters. We say a memory region b is valid in
memory m

    valid m b = b < nextblock m

when its index (a natural number) is less than that of the first region in
m's free list (CompCert models allocation deterministically, via a pointer
nextblock to the next free region in m). Once a region has been allocated,
it remains valid for the duration of an execution, even after the region is
freed.

We say an entire memory m is valid

    mem_valid m = inject (id m) m m

when every readable pointer value in m points to a valid (i.e., allocated)
block. The definition mem_valid is stated somewhat technically, as the
equivalent proposition that all regions in dom(m) are mapped by the identity
injection on m (id m) to themselves. This, in particular, implies that no
pointer references a region outside the domain of m.

We say a memory m' evolves forward, via one or more execution steps,
from an initial memory m when the following holds.

    forward m m' =
      ∀b. valid m b ⟹ valid m' b
                     ∧ ∀z. max_perm m' b z ⊆_perm max_perm m b z

Forward captures the minimal properties that should hold over any sequence
of execution steps in any language: (1) valid blocks in m should remain
valid in m'; and (2) execution steps should only decrease max permissions
(e.g., via drop_perm operations).
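As a hedged illustration of valid and forward, the following Python sketch models a memory by its nextblock counter and a table of max permissions. The integer encoding of permissions, the numbering of blocks from 1, and the class name Mem are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Assumed encoding of the permission order; 0 stands for "no permission".
NONEMPTY, READABLE, WRITABLE, FREEABLE = 1, 2, 3, 4


@dataclass
class Mem:
    nextblock: int = 1
    # Max permission per (block, offset); a missing key means no permission.
    max_perm: Dict[Tuple[int, int], int] = field(default_factory=dict)


def valid(m: Mem, b: int) -> bool:
    return b < m.nextblock


def forward(m: Mem, m2: Mem) -> bool:
    # (1) every block valid in m stays valid in m2 (blocks assumed numbered from 1)
    for b in range(1, m.nextblock):
        if not valid(m2, b):
            return False
    # (2) on blocks valid in m, max permissions may only decrease
    for (b, z), p2 in m2.max_perm.items():
        if valid(m, b) and p2 > m.max_perm.get((b, z), 0):
            return False
    return True


m = Mem(nextblock=3, max_perm={(1, 0): WRITABLE})
# Allocating block 3 and dropping a permission on block 1 is a forward step.
m_ok = Mem(nextblock=4, max_perm={(1, 0): READABLE, (3, 0): FREEABLE})
# Raising a max permission, or "forgetting" an allocated block, is not.
m_raise = Mem(nextblock=3, max_perm={(1, 0): FREEABLE})
m_shrunk = Mem(nextblock=2)
```

Note that permissions on freshly allocated blocks (block 3 above) are unconstrained, because those blocks were not valid in the initial memory.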

Finally, we say a memory m' is unchanged on a set of locations L, with
respect to an original memory m, when the following are true:

    unchanged_on m m' L =

      (1) For all locations in L, m and m' agree on permissions.

        ∀b z k p. (b,z) ∈ L ∧ valid m b
          ⟹ (perm m b z k p ⟺ perm m' b z k p)

      (2) For all locations in L, m and m' agree on contents (memvals).

        ∧ ∀b z. (b,z) ∈ L ∧ perm m b z Cur Readable ⟹ m(b,z) = m'(b,z)
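The two clauses of unchanged_on can be checked directly on finite memories. This Python sketch is an illustration under assumptions: permissions are integers, permission kinds are the strings "Cur" and "Max", and contents are integers rather than memvals.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple

PERMS = (1, 2, 3, 4)      # assumed encoding: Nonempty < Readable < Writable < Freeable
READABLE = 2
KINDS = ("Cur", "Max")


@dataclass
class Mem:
    nextblock: int
    perms: Dict[Tuple[int, int, str], int] = field(default_factory=dict)
    contents: Dict[Tuple[int, int], int] = field(default_factory=dict)

    def valid(self, b: int) -> bool:
        return b < self.nextblock

    def perm(self, b: int, z: int, k: str, p: int) -> bool:
        """Location (b, z) has at least permission p of kind k."""
        return self.perms.get((b, z, k), 0) >= p


def unchanged_on(m: Mem, m2: Mem, locs: Iterable[Tuple[int, int]]) -> bool:
    for (b, z) in locs:
        # (1) agreement on permissions, for blocks valid in m
        if m.valid(b):
            for k in KINDS:
                for p in PERMS:
                    if m.perm(b, z, k, p) != m2.perm(b, z, k, p):
                        return False
        # (2) agreement on contents, where m is currently readable
        if m.perm(b, z, "Cur", READABLE):
            if m.contents.get((b, z)) != m2.contents.get((b, z)):
                return False
    return True


m = Mem(3,
        perms={(1, 0, "Cur"): 3, (1, 0, "Max"): 4, (2, 0, "Cur"): 3, (2, 0, "Max"): 3},
        contents={(1, 0): 7, (2, 0): 9})
# m2 differs from m only in the contents of location (2, 0).
m2 = Mem(3, perms=dict(m.perms), contents={(1, 0): 7, (2, 0): 10})
```

Here m2 is unchanged on the set containing only (1, 0), but not on any set that also contains (2, 0).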


2.5 Global Environments

CompCert's global environments, defined by the following record type:

    Record Genv (F V : Type) : Type :=
      mkGenv {
        genv_symb : id → option block;
        genv_funs : block → option F;
        genv_vars : block → option (globvar V);
        genv_next : block;

        (* ... properties of the above projections ... *)
      }

will also make an appearance in later chapters. Each Genv is parameterized
by a type F, of function definitions specific to a particular language (for
example, in Clight, F is instantiated to the type of Clight function
definitions; in x86 assembly, to assembly code sequences), and by a type V of
auxiliary information associated to global variables (also language-specific).
Parameterizing Genvs by F and V makes them suitable for use in each of
CompCert's intermediate languages (a feature of standard CompCert).

The components of a Genv are:

A symbol table genv_symb mapping global identifiers to the memory regions
in which they are allocated;

A function table genv_funs mapping memory blocks to function definitions;

A global variable table genv_vars mapping memory blocks to auxiliary data
attached to global variables. globvar V is a record containing a value
of type V, variable initialization data, and variable read-only and
volatility status;

A pointer genv_next to the lowest-numbered memory region that does not
contain global data.

We call the domain of a global environment ge the blocks b such that
• there exists an id for which genv_symb ge id = Some b, or
• b ∈ dom(genv_funs ge), or
• b ∈ dom(genv_vars ge).

The Genv dependent record also asserts invariants on the global environment
(not shown) such as "all blocks marked global by the Genv are less than
genv_next."
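The domain of a global environment, and the quoted invariant, can be sketched as follows. This Python model is only illustrative: partial maps become dictionaries, function definitions and globvar records are opaque placeholders, and the helper names domain and globals_below_next are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Genv:
    genv_symb: Dict[str, int]      # global identifier -> block
    genv_funs: Dict[int, object]   # block -> function definition (type F)
    genv_vars: Dict[int, object]   # block -> globvar V
    genv_next: int                 # lowest block not holding global data


def domain(ge: Genv) -> Set[int]:
    """Blocks named by a symbol, or holding a function or a global variable."""
    return set(ge.genv_symb.values()) | set(ge.genv_funs) | set(ge.genv_vars)


def globals_below_next(ge: Genv) -> bool:
    """The example invariant: every global block is less than genv_next."""
    return all(b < ge.genv_next for b in domain(ge))


ge = Genv(
    genv_symb={"main": 1, "counter": 2},
    genv_funs={1: "<definition of main>"},
    genv_vars={2: "<globvar for counter>"},
    genv_next=3,
)
```

In Coq the invariant is carried as a proof inside the dependent record; here it is merely a predicate that can be tested.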


Chapter 3

Language-Independent Operational Semantics


Interaction semantics is a language-independent model of sequential and
(well-synchronized) concurrent threads. The core idea is to phrase
interaction, between modules, threads, or other program fragments, as calls to
external functions. Many kinds of interaction can be modeled in this way,
including linking (Chapter 4) and well-synchronized shared-memory
concurrency (a future application of the results of this thesis).

I use the term external function to describe functions callable but not
defined by a particular program unit (declared but not defined, in C
terminology). In a concurrent program, a thread might make calls to the
external functions lock and unlock. In a sequential program composed of
multiple translation units, one unit may call external functions defined only
by another unit.


3.1  Interaction  Semantics 

Imagine a multithreaded shared-memory execution. One can spawn a new
thread; a thread may yield (or block on a synchronization) and perhaps later
resume; eventually a thread may exit. This protocol models concurrency
but also sequential calls to separately compiled functions (spawn a new
"thread" to run the call, block until it returns) and single threads running
in an operating-system context with system calls. When a thread yields
(or calls a sequential external function), its local state including stack and
registers will be preserved until it resumes, but the state of most of memory
may have changed arbitrarily upon resumption.

Interaction semantics (Figure 3.1) are a general formulation of this thread
protocol. At a high level, an interaction semantics (G, C, M) is a partitioning
of a thread's state into a global environment (G), a local part (C), which we
call the core state, or core, and which typically includes both the control
continuation and local variable environment, and a shared part (M), which
we typically identify with shared memory. V is the type of values, and ℱ
is the type of external function names. In a (concurrently or sequentially)
multithreaded system, different cores could have different core types (C)
and different corestep relations. This permits interoperation of modules
written in different languages.

With this partitioning comes a step relation (corestep) on core states and
memories that defines the small-step operational model of the interaction
semantics. We will often write the corestep relation as ge ⊢ c, m ⟼ c', m'.
The global environment ge maps functions to their definitions and does not
vary over steps.

We say an interaction semantics sem is deterministic when its underlying
core step relation is deterministic.

Definition 1 (deterministic sem).

    ∀m m' m'' ge c c' c''.
      ge ⊢ c, m ⟼ c', m' ∧ ge ⊢ c, m ⟼ c'', m'' ⟹ c' = c'' ∧ m' = m''

To enforce the protocol described above, we divide core states into
five lifetime stages. Initial cores result directly from the creation of the
thread or initialization of the program using initial_core. Typically, an
initial core contains an empty local environment, together with a control
continuation consisting of a single function call (the V parameter in the
definition, a function pointer value), with arguments (list V). For a
standalone program, this function is the entry point main (as initialized by
the operating system/program loader); for a thread, it is the function that
was forked; for a call to a separately compiled module, it is the called
function. In general, each module entry point corresponds to an initial_core,
at the point at which that entry point is called; internal function calls (to
functions defined within the current module) do not call initial_core but
instead are handled internally, by the corestep relation of the defining
semantics.

[State diagram of a core's lifetime stages: initial core, running,
at_external, interference, after_external, halted.]

    initial_core   : G → V → list V → option C
    at_external    : C → option (ℱ × list V)
    after_external : option V → C → option C
    halted         : C → option V
    corestep       : G → C → M → C → M → Prop

Figure 3.1: Interaction Semantics interface. The types G (global environment),
C (core state), and M (memory) are parameters to the interface. ℱ is
the type of external function identifiers. V is the type of values, and Prop is
Coq's type of propositions. By convention, initial_core takes a pointer
of type V to the function to be called, rather than a function identifier ℱ.
The names initial_core, at_external, after_external, halted are not
constructors, but are (proved) disjoint predicates.


At_external cores are those initiating an external function call. In C
terminology, external functions are just functions that are declared within the
current translation unit or module but which are defined elsewhere (e.g.,
in a module that is later linked to the current one). After_external cores
result from resumption of the thread or program after an external call. In
the transition from after_external to a running state, a core is expected to
incorporate the return value (option V) into its local variables (in its own
language-dependent way). Halted cores are just that: threads or programs
that have terminated normally, yielding an optional return value (option V).
Finally, running cores are neither blocked on an external function call nor
halted.
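To make the lifetime stages concrete, here is a hedged Python sketch of a toy interaction semantics together with a driver that plays the role of the execution environment. The class ToySem, the driver run, the external function "add", and the encoding of cores as tagged tuples are all inventions for this example; the real interface is the Coq record of Figure 3.1.

```python
from typing import Callable, Dict, List, Optional, Tuple

Core = Tuple[str, int]
Mem = Dict[str, int]


class ToySem:
    """Toy semantics for a single function: double(x) calls external add(x, x)."""

    def initial_core(self, ge, fptr, args) -> Optional[Core]:
        if ge.get(fptr) == "double" and len(args) == 1:
            return ("start", args[0])
        return None

    def at_external(self, c: Core):
        return ("add", [c[1], c[1]]) if c[0] == "call" else None

    def after_external(self, ret: Optional[int], c: Core) -> Optional[Core]:
        if c[0] == "call" and ret is not None:
            return ("resume", ret)       # incorporate the return value
        return None

    def halted(self, c: Core) -> Optional[int]:
        return c[1] if c[0] == "ret" else None

    def corestep(self, ge, c: Core, m: Mem):
        if c[0] == "start":
            return ("call", c[1]), m
        if c[0] == "resume":
            return ("ret", c[1]), m
        return None                      # at_external and halted cores do not step


def run(sem, ge, fptr, args, mem: Mem,
        externals: Dict[str, Callable[[List[int], Mem], Tuple[int, Mem]]],
        fuel: int = 100):
    """Drive one core; external calls are handled by the environment."""
    c = sem.initial_core(ge, fptr, args)
    if c is None:
        return None
    for _ in range(fuel):
        result = sem.halted(c)
        if result is not None:
            return result, mem
        ext = sem.at_external(c)
        if ext is not None:
            name, ext_args = ext
            ret, mem = externals[name](ext_args, mem)   # memory may change arbitrarily
            c = sem.after_external(ret, c)
            if c is None:
                return None
            continue
        step = sem.corestep(ge, c, mem)
        if step is None:
            return None                                  # stuck
        c, mem = step
    return None


ge = {100: "double"}
externals = {"add": lambda a, m: (a[0] + a[1], {**m, "calls": m.get("calls", 0) + 1})}
```

The driver never inspects the core beyond the five interface functions, which is what allows cores written in different languages to be run by the same environment.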


3.2  Examples 

3.2.1  CompCert  Clight 

As an example of an interaction semantics, I show CompCert Clight [BDL06].
This high-level subset of C is the target of CompCert's first translation phase
(from the full CompCert C language). It serves as a natural interface
between CompCert, user-level program logics, and verified static analyses.

Figures 3.2 and 3.3 give the syntax of Clight. The syntax of expressions
a is standard. In the statement syntax, for and while loops have already
been translated (in an earlier compiler phase) to combinations of the more
primitive Sloop and Sbreak constructs. The details of local control flow (loop,
if, break, continue, switch, goto) are standard CompCert 1.13 Clight, and
not relevant to (or changed by) our work on external interaction.

Statements

    s ::= Sskip                      no-op
        | Sassign a1 a2              lval := rval
        | Sset id a                  temp := rval
        | Scall optid a a⃗            function call
        | Sbuiltin optid f τ⃗ a⃗       intrinsic
        | Ssequence s1 s2            sequence
        | Sifthenelse a s1 s2        conditional
        | Sloop s1 s2                infinite loop
        | Sbreak | Sreturn aopt      break/return
        | Scontinue s                continue statement
        | Sswitch s                  switch statement
        | Slabel l s                 introduce new label
        | Sgoto l                    unconditional jump

Functions

    τ   ::= int | long | ptr τ | ···     C types
    γ   ::= · | (id, τ), γ               typing environments
    f_i ::= return τ                     function return type
            params γ                     function parameter typing
            locals γ                     local variable typing
            temps γ                      temporary variable typing
            body s                       function body
    f   ::= Internal f_i | External id_f τ⃗ τ

Figure 3.2: Syntax and semantics of Clight (excerpts). optid in Scall and
Sbuiltin statements is the (optional) variable in which to store the return
value of the function (may be None if the function has void return type).

Continuations

    k ::= Kstop                      safe termination
        | Kseq s k                   sequential composition
        | Kloop s1 s2 k              loop continuation
        | Kswitch k                  catch switch break
        | Kcall optid f_i pv pt k    catch function return

Core States

    pv ::= · | (id, (loc × τ)), pv   addressed var. environment
    pt ::= · | (id, v), pt           temporaries environment
    c  ::= State_CL f_i s k pv pt    "running" states
         | CallState f v⃗ k           call (internal or external) function f
         | ReturnState v k           return from (internal or external) function

Figure 3.3: Syntax and semantics of Clight (continued). Continuations and
core states appear only in the operational semantics.


Functions f are either internal (defined in the current translation unit) or
external (declared here but defined elsewhere). Internal functions comprise
a record containing the function return type, a list of function parameters
with their types, a local variable environment for address-taken variables,
a temporaries environment for the rest of the function variables, and the
function body. External function records contain an external function
identifier id_f, a list of argument types τ⃗ and a return type τ, where τ is a
C type int, long, ptr τ, etc. External functions do not contain a function
body. The interaction semantics for Clight will stop at external calls and
yield control to the execution environment. By convention, I use f and f_i to
range over function definitions (Figure 3.2), while id_f ranges over function
names.


Semantics. The semantics of Clight depends on continuations k, described
in Figure 3.3, and core states c, which come in three varieties. Normal
states State_CL model the "running" states of a Clight program, during
evaluation of anything but function calls, and consist of the current
function being executed f_i, the function body s, the control continuation k,
and two environments, pv for mapping address-taken stack variables to
their locations in memory, and pt for mapping temporary variables to their
values. CallState f v⃗ k models Clight programs that are about to call
function f (either internal or external) with arguments v⃗ and continuation k.
ReturnState v k gives the state that results after returning from function
calls (either internal or external). v is the value returned by the callee; k is
the continuation to be executed after the call returns.

    ge ⊢ a ⇓(pv,pt,m) v_f     ge ⊢ a⃗ ⇓(pv,pt,m) v⃗     ge v_f = Some f
    typeOf f = Tfunction τ⃗ τ
    ─────────────────────────────────────────────────────────── (Scall)
    ge ⊢ (State_CL f_i (Scall idopt a a⃗) k pv pt), m ⟼
         (CallState f v⃗ (Kcall idopt f_i pv pt k)), m

    noRepeat (params f_i ∪ locals f_i)
    allocVars p0 m (params f_i ∪ locals f_i) = Some (pv, m1)
    bindParams pv m1 (params f_i) v⃗ = Some m'
    ─────────────────────────────────────────────────────────── (CallInternal)
    ge ⊢ (CallState (Internal f_i) v⃗ k), m ⟼
         (State_CL f_i (body f_i) k pv (initTempEnv (temps f_i))), m'

Figure 3.4: Call rules from the operational semantics of Clight

Figure 3.4 shows our reformulation of the function call rules of the
Clight operational semantics. The operational semantics is a three-place
relation on global environments ge : G, initial configurations (c, m) and final
configurations (c', m'). Here c is a core state; m is a CompCert memory.
The relation ge ⊢ a ⇓(pv,pt,m) v denotes big-step evaluation of expression a
to value v in global environment ge, local variable environment pv,
temporaries environment pt, and memory m.

The Scall rule steps a run state State_CL calling function a with
arguments a⃗ (Scall idopt a a⃗) to a CallState. The result of the call is stored
in idopt. The current function context f_i is pushed into the return
continuation Kcall. f is the function being called, and may be either internal
or external.

The CallInternal rule steps into a function body. Function parameters
and locals are stored in memory: allocVars allocates a new memory
region for each parameter/local, producing variable-location mapping pv.
bindParams writes the function arguments v⃗ into the parameter locations in
memory. There is no corresponding rule for external function calls (they
are at_external).

I define the at_external function of interaction semantics as a
straightforward match on a core state c, returning Some (id_f, v⃗) when c is a
CallState, f is external, and the arguments v⃗ to f are well-defined (not
CompCert's Vundef value), and None otherwise:

    cl_at_external c : option (ℱ × list V) =
      case c of
      | State_CL _ _ _ _ _ → None
      | CallState (Internal f_i) v⃗ k → None
      | CallState (External id_f τ⃗ τ) v⃗ k →
          if defined v⃗ then Some (id_f, v⃗) else None
      | ReturnState _ _ → None

Clight after_external injects the return value of an external function into
a Clight core state as follows:

    cl_after_external vopt c : option C =
      case c of
      | CallState f v⃗ k →
          case f of
          | Internal _ → None
          | External id_f τ⃗ τ →
              case vopt of
              | None → Some (ReturnState v_undef k)
              | Some v → Some (ReturnState v k)
      | _ → None

First, we check whether c is a CallState, with continuation k. If it is, and
the function being called was external, we produce a ReturnState with return
value v (whenever vopt was Some v) and v_undef (whenever vopt was None).
In the vopt = None case, as long as the external function that was called has
void return type, the value v_undef will never be used by the caller. In all
other cases, we just return None.
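The two case analyses above translate almost line for line into an executable model. The Python sketch below is illustrative only: Clight core states are modeled as frozen dataclasses, the running-state constructor is omitted, continuations are opaque, and VUNDEF is a hypothetical stand-in for CompCert's Vundef.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

VUNDEF = "Vundef"   # stand-in for CompCert's undefined value


@dataclass(frozen=True)
class Internal:
    name: str                      # placeholder for an internal function record


@dataclass(frozen=True)
class External:
    ident: str                     # external function identifier
    arg_types: Tuple[str, ...]
    ret_type: str


@dataclass(frozen=True)
class CallState:
    f: Any                         # Internal or External
    args: Tuple[Any, ...]
    k: Any                         # continuation (opaque here)


@dataclass(frozen=True)
class ReturnState:
    v: Any
    k: Any


def cl_at_external(c) -> Optional[Tuple[str, Tuple[Any, ...]]]:
    """Some (id_f, args) exactly for calls to external functions with defined arguments."""
    if isinstance(c, CallState) and isinstance(c.f, External):
        if all(v != VUNDEF for v in c.args):
            return (c.f.ident, c.args)
    return None


def cl_after_external(vopt: Optional[Any], c) -> Optional[ReturnState]:
    """Resume a core blocked on an external call with the optional return value."""
    if isinstance(c, CallState) and isinstance(c.f, External):
        return ReturnState(VUNDEF if vopt is None else vopt, c.k)
    return None


ext = External("putchar", ("int",), "int")
blocked = CallState(ext, (65,), "Kstop")
```

For instance, cl_at_external(blocked) yields the identifier "putchar" with its argument, and cl_after_external(None, blocked) resumes with the undefined value, matching the void-return case discussed above.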

The definition of initial_core ge v v⃗ is also straightforward, since function
arguments are passed not on the stack but abstractly, without reference
to memory: we check that v is a valid pointer to a defined function f_i,
check that the arguments v⃗ are defined and match f_i's type signature, then
introduce state

    CallState (Internal f_i) v⃗ Kstop

which immediately steps to the body of function f_i with the initial local
variable environment pv that maps the function's formal parameters to its
arguments v⃗. The definitions of initial_core in the languages below Clight
follow a similar regime, all the way down to CompCert's Linear language,
which uses an environment of abstract locations such as incoming parameter
stack slots to represent the state of the stack and registers.

Readers familiar with CompCert (versions 1.5 through 2.4) will observe
the proximity of our definition to Leroy et al.'s presentation: our
adaptation removes the memory components from the state constructors State_CL,
CallState, and ReturnState and adds the definitions of after_external and so
on. The operational semantics arises by refactoring the existing definition to
match these state representation changes and by removing the rule for
external function calls: such calls are handled directly by interaction
semantics at_external and after_external.

3.2.2  CompCert  x86  Assembly 

Adapting x86 assembly (Figure 3.5) is a bit trickier, since arguments must
be passed concretely, on the stack. (The same applies to CompCert's Mach
language.) As we will see in Chapter 4, we use the initial_core function of
the interaction semantics interface to model both program initialization (i.e.,
by the loader) and the function calls that occur at cross-module function
invocations. If we knew that all modules in our program were written in
x86 assembly and used, e.g., the standard cdecl calling convention ("C
declaration", parameters pushed in reverse order on the stack), then
modeling cross-module invocations would be less of an issue: The shared
calling convention would mean that arguments to one function (say, B.g)
would be placed by a caller A.f on the stack or in registers exactly as
expected by function B.g.

But the restriction to a shared calling convention/ABI is rather limiting.
We want to be able to model, at least abstractly, the interactions of modules
in a variety of languages, at both higher and lower levels of abstraction.
To accomplish this, we apply a "marshalling" transformation to the x86
language: To initialize a new x86 core, calling function b_f with arguments v⃗,
we produce state Asm.CallStateIn b_f v⃗, which immediately steps to a running
state. The initial_core function as well as the operational semantics rule that
steps an Asm.CallStateIn to a running State_ASM are given in Figure 3.6. As a
side effect of this step we allocate a "dummy" stack frame in memory in
which we store the incoming arguments v⃗, in right-to-left cdecl order as
expected by CompCert and gcc. (Asm.CallStateOut performs the symmetric
step of marshalling arguments out of memory.)

Registers

    r_i     ::= EAX | EBX | ECX | EDX
              | ESI | EDI | EBP | ESP             integer registers
    r_f     ::= XMM0 | ··· | XMM7                 floating-point registers
    crstate ::= ZF | CF | PF | SF | OF            control register state
    r       ::= PC | IR r_i | FR r_f
              | ST0 | CR crstate | RA             collected registers
    rs                                            register environments

Instructions

    p ::= MOVrr r_i r_i | MOVri r_i i | ···       moves
        | JMPl l | JMPs id | JMPC cond l | ···    jumps
        | CALLs id | CALLr r_i | RET | ···        calls/return
        | ···      moves with conversion, integer arithmetic, etc.

Load Frames

    lf ::= mkLoadFrame b_f τ0

Core States

    c ::= State_ASM rs lf                         normal states
        | Asm.CallStateIn b_f v⃗                   marshall args. in
        | Asm.CallStateOut (b_f, τ0, τ0) v⃗ rs     marshall args. out

Figure 3.5: Syntax and semantics of CompCert x86 assembly (excerpts).
Core states appear only in the operational semantics. Int-floatness types τ0
are int, float, long, or single. Load frames (mkLoadFrame b_f τ0) store a
pointer b_f to a (copied) stack frame containing incoming arguments, as well
as the return type τ0 of the function that was initialized.


     1  asm_initial_core ge v v⃗ : option C_asm =
     2    case v of
     3    | Vptr b_f i →
     4        if i == 0
     5        then case find_funct_ptr ge b_f of
     6             | None → None
     7             | Some (Internal f) →
     8                 let tys = sig_args (funsig f) in
     9                 if vals_have_types v⃗ tys
    10                    && defined v⃗ && 4*(2*length v⃗) < max_unsigned
    11                 then Some (Asm.CallStateIn b_f v⃗ tys)
    12                 else None
    13    | _ → None

    argsLen v⃗ τ⃗ = Some z      alloc m 0 (4z) = (m2, b_stk)
    storeArgs m2 b_stk v⃗ τ⃗ = Some m'
    rs0 = empty with {PC := Vptr b_f 0}{RA := 0}{ESP := Vptr b_stk 0}
    ─────────────────────────────────────────────────────────── (AsmInit)
    ge ⊢ (Asm.CallStateIn b_f v⃗ τ⃗ τ), m ⟼
         (State_ASM rs0 (mkLoadFrame b_stk τ)), m'

Figure 3.6: x86 initialization. Top is x86 initial_core. Bottom is the
operational rule that steps an Asm.CallStateIn to a running State_ASM.


In initial_core, we first check (line 3) that v is a function pointer. If it is,
we look up the function body f associated with the pointer (line 5), if any,
and then check that the arguments to the function match f's type signature
(line 9), the arguments are defined (line 10), and that the arguments are
representable in memory (also line 10). The last check is subtle: it is possible
that the arguments v⃗ overflow the address space, in which case the values
written into the initial stack frame do not directly match v⃗. The 2 in this
line conservatively approximates value encoding: doubles and long long
integers are encoded in CompCert x86 as two 32-bit words. The 4 specializes
bytes-per-word in (32-bit) x86.

The AsmInit operational rule stores the arguments v⃗ into a freshly allocated
dummy stack frame. First, we calculate the size of the stack frame (argsLen).
The types τ⃗ are passed as a second argument to facilitate value encoding.
Then we allocate a block of size 4*z (because words are 4 bytes) and store
the arguments (storeArgs) into the allocated block b_stk. rs0 is the initial
register state for the module. It sets PC to function pointer Vptr b_f 0,
return address register RA to 0, and stack pointer register ESP to
Vptr b_stk 0, a pointer to the allocated dummy stack frame. When we step from
Asm.CallStateIn to the running state State_ASM, we record the block address
b_stk of the dummy stack frame and the return type τ of the initial function
as a load frame (mkLoadFrame). We use the load frame (a state component not
present in original CompCert's x86) to express simulation invariants on the
initial stack frame in the proof of the translation to x86.
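The marshalling step can be sketched in Python as follows. This is a rough model under stated assumptions, not the Coq rule: the word counts per type (float and long taking two words), the dictionary representation of memory and registers, placing the first argument at offset 0, and the helper names representable, args_len and asm_init are all choices made for this illustration.

```python
from typing import Any, Dict, List, Tuple

WORD = 4                          # bytes per word on 32-bit x86
MAX_UNSIGNED = 2**32 - 1
# Assumed sizes in words: float (a double) and long occupy two 32-bit words.
WORDS = {"int": 1, "single": 1, "long": 2, "float": 2}


def representable(args: List[Any]) -> bool:
    """The conservative size check of initial_core: two words per argument."""
    return WORD * (2 * len(args)) < MAX_UNSIGNED


def args_len(tys: List[str]) -> int:
    """Size of the dummy frame in words (the z of the AsmInit rule)."""
    return sum(WORDS[t] for t in tys)


def asm_init(mem: Dict[int, Any], nextblock: int, b_f: int,
             args: List[Any], tys: List[str], ret_ty: str):
    """Allocate a dummy frame, store the arguments, build the initial state."""
    z = args_len(tys)
    b_stk = nextblock                          # fresh block for the dummy frame
    slots, ofs = {}, 0
    for v, t in zip(args, tys):                # first argument at the lowest offset
        slots[ofs] = v
        ofs += WORD * WORDS[t]
    new_mem = dict(mem)
    new_mem[b_stk] = {"size": WORD * z, "slots": slots}
    rs0 = {"PC": ("ptr", b_f, 0), "RA": 0, "ESP": ("ptr", b_stk, 0)}
    core = ("State_ASM", rs0, ("mkLoadFrame", b_stk, ret_ty))
    return core, new_mem, nextblock + 1


core, mem2, nb = asm_init({}, 5, 2, [7, 2.5, 9], ["int", "float", "int"], "int")
```

With one int, one double and one int, the frame occupies four words (16 bytes), and the load frame remembers both the frame's block and the return type.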

Discussion. For x86 modules that share the same calling convention, the
modeling step I describe above does not occur at runtime (nor does the
compiler output any "marshalling" or copying code). The advantage of sticking
to the abstract "calling convention" imposed by initial_core, in which values
are passed abstractly instead of in memory and registers according to a
particular calling convention, is increased flexibility to model the
interactions of modules in a wide variety of languages. By a variety of
languages, I mean not only Clight and x86 but also x86 modules following
different calling conventions, such as cdecl and Microsoft fastcall (for which
additional code would have to be inserted at linktime).

On the other hand, it is not immediately obvious that the use of a
"dummy" stack frame accurately models the interactions of linked modules
running on a real machine, even when the modules share the same
calling convention. Take, for example, the following C function:

    int f(int* p, int x) {
        x = 0;
        *p = 1;
        return x;
    }

The  function  f  takes  two  arguments,  an  integer  pointer  p  and  an  integer 
x.  First,  it  assigns  x  the  value  0.  Then  it  writes  the  value  1  to  memory  at 
location  p,  and  returns  x.  If  we  compile  and  link  f  with  the  code 

    extern int f(int* p, int x);

    int main(void) {
        int a;
        return f(&a, 0);
    }

in a second translation unit, the resulting x86 program returns 0, as
expected. In fact, f should return 0 regardless of the values of x and p. For
example, it is sound to rewrite this function, by simple constant
propagation, to:

    int f(int* p, int x) {
        *p = 1;
        return 0;
    }

The  question  is:  does  our  x86  semantics  adequately  model  this  behavior? 

Graphically, at the point of the call to f in main, the stack looks like
this:

            Stack
    Arg1  |  0   |
    Arg0  |  &a  |

The arguments have been pushed in right-to-left order. In the body of f,
the write to p corresponds to a write to address &a (as loaded from address
Arg0); the x returned is equal to the value at address Arg1 (equal 0).

The  x86  semantic  model  produces  a  slightly  different  runtime  state.  In 
order  to  handle  the  call  to  f,  it  initializes  a  new  x86  core  which  immediately 
allocates  a  fresh  memory  region  in  which  to  store  copies  of  the  function 
arguments.  The  resulting  state  is  the  following: 


            Stack
    Arg1   |  0   |
    Arg0   |  &a  |
    Arg1'  |  0   |
    Arg0'  |  &a  |

Arg1' and Arg0' are the addresses of the copied arguments. In this case,
the copying does not change the behavior of the program (x and p have the
same values as before). In general, when the arguments on the stack are not
written to by the callee, the copying semantics simulates the no-copying
semantics, meaning copying is a sound abstraction.

But it is also possible that p aliases the stack location at which the x
argument to f is passed, leading f to inadvertently mutate its parameter x.
For example, the following x86 assembly code, due to Tahina Ramananandro:

    main:
        pushl   $0
        pushl   %esp
        call    f
        popl    %eax
        popl    %eax
        ret

causes f to mutate x, by first pushing the argument corresponding to
parameter x (equal 0) onto the stack, then pushing %esp (the address at
which 0 was just stored) at the parameter position corresponding to p.¹
Graphically, the situation is:

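[Figure: stack layouts without copying (left) and with copying (right). In
both layouts the pointer stored at Arg0 points to Arg1. With copying, the
callee reads its parameters from the copies Arg1' and Arg0'.]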

The stack layout without copying is on the left. The layout with copying
is on the right. The pointer at address Arg0 still points to the Arg1 memory
location, even after copying. The incoming parameter x resolves to the
memory cell at location Arg1' with copying, and to Arg1 without, resulting
in two different return values depending on whether copying occurs.
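
The difference between the two layouts can be replayed in a small simulation.
The following is only a sketch in Python, not the x86 semantics of
Section 3.2.2. Memory is a dictionary from integer addresses to values, the
concrete addresses are made up for illustration, and the body of f is an
assumption (store 1 through p, then return x) chosen to be consistent with
the return values discussed in the text.

```python
# Hypothetical addresses for the two argument slots and their copies.
ARG0, ARG1 = 100, 104            # caller's outgoing-argument slots
ARG0_COPY, ARG1_COPY = 200, 204  # fresh region used by the copying model


def f(mem, x_addr, p_addr):
    """Assumed body of f: store 1 through p, then return x."""
    p = mem[p_addr]     # load the pointer argument
    mem[p] = 1          # *p = 1
    return mem[x_addr]  # return x


def call_without_copying(mem):
    # The callee reads its parameters directly from the caller's slots.
    return f(mem, ARG1, ARG0)


def call_with_copying(mem):
    # The callee first copies its arguments into a fresh region.
    mem[ARG1_COPY] = mem[ARG1]
    mem[ARG0_COPY] = mem[ARG0]
    return f(mem, ARG1_COPY, ARG0_COPY)


def aliased_stack():
    # main pushes 0 (x), then pushes the address of that slot (p).
    return {ARG1: 0, ARG0: ARG1}


def well_behaved_stack():
    # p points at an ordinary local (address 300), not at a parameter slot.
    return {ARG1: 0, ARG0: 300, 300: 0}


results = {
    "aliased, no copy": call_without_copying(aliased_stack()),
    "aliased, copy": call_with_copying(aliased_stack()),
    "well-behaved, no copy": call_without_copying(well_behaved_stack()),
    "well-behaved, copy": call_with_copying(well_behaved_stack()),
}
```

In this model the two calling conventions disagree only for the aliased
stack, which is the situation constructed by the assembly implementation of
main.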

In this case, the no-copying semantics produces the wrong result (as we
mention above, f should return 0 regardless of its parameters). But where
does the fault lie? Is it the compiler? gcc -O0, for example, uses the
no-copying semantics on the left, resulting in return value 1 when linked with
the assembly code implementation of main above. But to further confuse
matters, gcc -O2 turns on constant propagation, resulting in the "correct"
return value 0. Constant propagation should at least be sound for f. At the
same time, the compiler should not be forced to copy in order to produce
correct code, since copying is expensive.

The  better  answer  is  that  the  assembly  code  implementation  of  main 
is  at  fault.  We  just  should  not  pass  pointers  to  stack-allocated  parameters 
when  calling  external  functions  from  assembly  contexts.  Since  gcc  is  the  de 
facto  standard  C  compiler,  and  it  implicitly  requires  well-behaved  contexts 
that  do  not  alias  parameter  slots,  then  so  should  we. 

One rationale here is that the outgoing parameters to an external call,
while technically allocated in the caller's stack frame, are fresh locations in
a sense "owned" by the callee. The caller should not be allowed to generate
and pass pointers to these locations. More pragmatically, the compiler
should not be required to copy in order to produce correct code and to
validate common optimizations such as constant propagation, in which case
the program context, not the no-copying compiler (gcc), is at fault.


1 pushl %esp pushes the value of %esp as it existed before the stack pointer
is decremented. [Int, Volume 2, Chapter 4 (Instruction Set Reference, N-Z), PUSH]

When implementing modules in C, the issue does not arise quite as
directly: the outgoing arguments of a C external function call are not
laid out concretely, i.e., on the stack, at the level of the C programming
language abstraction. A C program cannot manipulate pointers to these
(nonexistent) locations. When compiling from C to assembly, it is possible
to  show  (though  I  have  not  proved  this  theorem  in  Coq)  that  the  generated 
assembly  code  follows  the  no-pointers-to-parameters  policy:  Because  the 
parameter  stack  slots  introduced  by  register  allocation  are  fresh  memory 
locations  (they  are  not  allocated  at  all  until  the  register  allocation  phase), 
there  are  no  existing  pointers  in  memory  to  these  locations.  Nor  does  the 
compiler  introduce  new  pointers  to  these  locations. 


3.2.3  Gallina  Semantics 

The interaction semantics abstraction is suitable for expressing more than
just traditional programming language semantics (e.g., the one for Clight
given in Section 3.2.1, or x86 in Section 3.2.2, or any of the other CompCert
intermediate languages). Interaction semantics are general enough to model
arbitrary relations over values and memories. This section demonstrates by
constructing an interaction semantics of relations in Coq's specification
language, Gallina (which is in turn suitable for expressing all of mathematics).
States  in  the  Gallina  semantics  have  type 

gallinaState = option (gallinaRel × list V)

where by gallinaRel I mean the type of relations

gallinaRel = ∀(v⃗ : list V)(m_pre : M)(m_post : M). Prop

that map a list of argument values v⃗, and pre- and post-memories m_pre
and m_post, to Coq propositions (type Prop). The option in the definition
of  gallinaState  is  used  to  indicate  whether  a  Gallina  semantics  has  been 
"executed"  yet.  If  the  option  is  None,  then  the  Gallina  semantics  is  halted. 


halted  (c  :  gallinaState)  :  option  V  = 
case  c  of 
Otherwise, we take a step, provided the following conditions hold.

corestep (ge : G) (c : gallinaState) (m : M) (c′ : gallinaState) (m′ : M) =
  (∃R : gallinaRel. ∃v⃗. c = Some (R, v⃗) ∧ R v⃗ m m′) ∧ c′ = None

In other words, a Gallina semantics steps from configuration c, m to new
configuration c′, m′ provided that c is Some (R, v⃗), for some relation R and
initial arguments v⃗, the relation R is satisfied by v⃗, m, and m′ (R v⃗ m m′),
and c′ is halted (c′ = None). The key point is that the semantics is stuck
whenever the relation R does not hold.

The definitions of the other required functions (at_external, after_external,
initial_core) are straightforward. We simply say that the semantics is never
blocked.

at_external (c : gallinaState) : option (extFun × list V) = None
after_external (v_opt : option V) (c : gallinaState) : option gallinaState = None

To construct an initial Gallina core, we parameterize by a relation R :
gallinaRel and instantiate the initial core with this relation.

initial_core (R : gallinaRel) (ge : G) (v : V) (v⃗ : list V) : option gallinaState
  = Some (R, v⃗)

There are many other ways in which such a semantics can be expressed.
The relation R can enforce that m = m′, in which case we have a semantics
of (unary) assertions on memory states. It is also possible to parameterize
R not only by the arguments to initial_core, but also by the value v (typically
of form Vptr b z) that identifies the function (e.g., at block address b) that
this core was spawned to handle.
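
The Gallina semantics is small enough to sketch executably. The Python
below is an illustration under stated assumptions, not the Coq development:
values are integers, memories are dictionaries, relations are Boolean
functions, None stands for the halted state, and halted reports only whether
the core has halted (the return value is left out). The relation store_rel is
a hypothetical example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Value = int
Memory = Dict[int, Value]
GallinaRel = Callable[[List[Value], Memory, Memory], bool]
# None plays the role of the halted ("already executed") state.
GallinaState = Optional[Tuple[GallinaRel, List[Value]]]


def initial_core(rel: GallinaRel, args: List[Value]) -> GallinaState:
    """Instantiate the initial core with the relation and its arguments."""
    return (rel, list(args))


def corestep(c: GallinaState, m: Memory,
             c_next: GallinaState, m_next: Memory) -> bool:
    """True iff c = Some (R, args), R args m m_next holds, and c_next = None."""
    if c is None:
        return False
    rel, args = c
    return rel(args, m, m_next) and c_next is None


def halted(c: GallinaState) -> bool:
    """Simplification: report only whether the core has halted."""
    return c is None


def at_external(c: GallinaState) -> None:
    """The Gallina semantics is never blocked on an external call."""
    return None


def store_rel(args: List[Value], m_pre: Memory, m_post: Memory) -> bool:
    """Example relation: m_post is m_pre with args[1] stored at args[0]."""
    addr, val = args
    expected = dict(m_pre)
    expected[addr] = val
    return m_post == expected


core = initial_core(store_rel, [8, 42])
```

With this encoding, corestep(core, {8: 0}, None, {8: 42}) holds, while any
other post-memory leaves the semantics stuck, mirroring the remark that the
semantics is stuck whenever R does not hold.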


3.2.4  Trace  Semantics 

Gallina  semantics  (Section  3.2.3)  demonstrated  an  interaction  model  of 
arbitrary  relations  on  function  arguments  and  memories.  But  the  relations 
R  which  I  employed  in  that  section  were  history  independent,  in  the  sense 
that  they  did  not  predicate  over  the  history,  or  trace,  of  external  function 
call  events  produced  by  program  executions. 

In  this  section,  I  demonstrate  an  interaction  semantics  that  does  record 
interaction  traces.  This  trace  semantics,  T,  differs  from  the  semantics  shown 
previously  for  Clight,  x86,  and  Gallina  in  that  it  is  an  operator,  or  functor, 
over  interaction  semantics.  As  input,  T  takes  an  interaction  semantics  sem  : 
Semantics  G  C  M  and  an  axiomatization,  spec,  of  the  external  functions  that 
may be called by sem; as output, it produces a new interaction semantics
in  which  core  states  have  been  augmented  to  record  the  interaction  trace. 
Core  states  of  the  resulting  semantics  are  products  of 

•  C  states, 

•  a  program  trace  trs,  and 

•  an  external  state  (modeling  the  configuration  of  the  environment). 

In  the  definition  of  T,  we  thread  external  states  through  the  semantics.  The 
external  states  themselves  are  only  updated  over  external  function  calls. 

At external call points, we use spec, which contains pre- and postconditions
for each external function, to specify the pre- and postmemories of
the external function being called, as well as the effect of that function on
the external state (for example, an external call to fopen might change the
state of a model of the filesystem to specify that a particular file is
now open). The details are as follows.

Traces  (of  finite  program  executions)  are  defined  as  lists  of  events 

trs ∈ trace = list event

where  by  event  I  mean  a  record  of  the  input-output  behavior  of  a  call  to  an 
external  function: 

Record  event  :  Type  = 

{  ef  :  ident; 
args  :  list  V; 
retv  :  option  V; 
pre  :  M; 
post  :  M  } 

The field ef is the identifier of the external function that was called. Values
args are the arguments to ef; retv is the optional return value; pre and
post are the memory states at the external function call and return points,
respectively.

We  define  the  core  states  of  trace  semantics  as  the  type 

C_tr = C × list event × O

where  O  :  Type  is  the  additional  parameter  to  T  that  gives  the  type  of 
external  states. 

The step relation has two cases. The first is just a congruence rule, in
which we step the core state c and memory m in trace semantics configuration
(c, trs, ω) using the step relation of the underlying semantics:

    ge ⊢ c, m ⟼ c′, m′
    ─────────────────────────────────────────  (TraceStep)
    ge ⊢ (c, trs, ω), m ⟾ (c′, trs, ω), m′
The external state and trace are left unchanged (internal steps are not
observable). The second case handles external function calls.

    at_external c = Some (ef, v⃗)    spec ef = (P, Q)
    P(x, ω, v⃗, m)    Q(x, ω′, v_opt, m′)    after_external v_opt c = Some c′
    ─────────────────────────────────────────────────────────────────  (TraceExternal)
    ge ⊢ (c, trs, ω), m ⟾ (c′, mkEvent ef v⃗ v_opt m m′ :: trs, ω′), m′

Core state c is at external calling external function ef with arguments
v⃗. The pre- and postconditions of ef as provided by spec are P and Q
respectively. P is satisfied by v⃗, the precall memory m, and external state ω;
m′, the return value v_opt, and new external state ω′ satisfy the postcondition
Q. Injecting the return value v_opt into c using after_external results in the
new core state c′. From initial trace semantics state (c, trs, ω), we step to
final state (c′, mkEvent ef v⃗ v_opt m m′ :: trs, ω′) in which the new event
mkEvent ef v⃗ v_opt m m′, signaling a call to external function ef, has been
consed onto the head of the current trace.
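
The two rules can be exercised on a toy underlying semantics. The Python
sketch below is an illustration only: the core language ("inc" operations
and blocking "call" operations), the fopen specification, and the choice of
a set of open file names as external state are all assumptions made for the
example. Because the rule TraceExternal leaves the outcome of the call
unconstrained beyond P and Q, the function trace_external takes a candidate
outcome as input and checks the premises.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    ef: str
    args: tuple
    retv: Optional[int]
    pre: dict
    post: dict


# Toy underlying semantics: a core is a tuple of pending operations.
def core_step(c, m):
    if c and c[0] == "inc":              # internal step: increment cell 0
        m2 = dict(m)
        m2[0] = m2.get(0, 0) + 1
        return c[1:], m2
    return None                          # halted or blocked at a call


def at_external(c):
    if c and isinstance(c[0], tuple) and c[0][0] == "call":
        return c[0][1], c[0][2]          # (ef, args)
    return None


def after_external(retv, c):
    return c[1:] if at_external(c) is not None else None


def trace_step(state, m):
    """TraceStep: step the underlying core; trace and external state unchanged."""
    c, trs, omega = state
    out = core_step(c, m)
    if out is None:
        return None
    c2, m2 = out
    return (c2, trs, omega), m2


def trace_external(spec, state, m, x, omega2, retv, m2):
    """TraceExternal: check P and Q for a candidate outcome, then record it."""
    c, trs, omega = state
    call = at_external(c)
    if call is None:
        return None
    ef, args = call
    P, Q = spec[ef]
    if not (P(x, omega, args, m) and Q(x, omega2, retv, m2)):
        return None
    c2 = after_external(retv, c)
    if c2 is None:
        return None
    return (c2, [Event(ef, args, retv, dict(m), dict(m2))] + trs, omega2), m2


# Hypothetical specification of fopen; x carries the arguments from P to Q.
SPEC = {"fopen": (
    lambda x, omega, args, m: x == args and args[0] not in omega,
    lambda x, omega2, retv, m2: x[0] in omega2 and retv is not None,
)}

c0 = ("inc", ("call", "fopen", ("log.txt",)), "inc")
s1, m1 = trace_step((c0, [], frozenset()), {0: 0})
s2, m2 = trace_external(SPEC, s1, m1, ("log.txt",),
                        frozenset({"log.txt"}), 3, m1)
s3, m3 = trace_step(s2, m2)
```

After the run, the trace component of s3 holds a single fopen event, and the
external state records that log.txt is open.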

The x parameter to both P and Q is a peculiarity of how we define function
specifications (Chapter 7): for each function identifier ef, spec provides
a pre- and postcondition that are parameterized not only by the function
arguments, return value, initial and final memories, etc., but also by a value
x of the (dependent) type spec_type ef, where

spec_type : ident → Type

is a function from external function identifiers to Type. Using this spec_type
convention makes it possible to communicate information, in a
function-specific way, between the precondition P and the postcondition Q. (For
example, requiring x = v⃗ in P communicates the function arguments to
the postcondition.) Binary postconditions, which parameterize Q by the
initial state m and arguments v⃗ in addition to m′ and v_opt, serve a similar
purpose.

Trace  semantics  as  presented  above  assumes  that  external  functions 
do  not  themselves  produce  observable  events,  besides  the  single  mkEvent 
consed  onto  the  trace  above  to  mark  the  external  call.  To  lift  this  restriction, 
external  function  specifications  can  be  augmented  to  include  the  function 
trace  produced  for  given  inputs,  via  a  relation 

traceOf : ident → list V → M → trace → Prop

that associates an external function name, the function arguments, and
precall memory to a set of event traces. Assuming traceOf ef v⃗ m trs′ holds,




the trace that results in rule TraceExternal is then

mkEvent ef v⃗ v_opt m m′ :: trs′ ++ trs

in which trs′ has been interposed between the mkEvent and the tail trs.

Another limitation is that T does not deal gracefully with nontermination.
The semantics faithfully models programs that may make an arbitrary
finite number of external function calls, but not those that make infinitely
many external function calls (reactive divergent programs). Nor does the
trace model deal adequately with external functions that may diverge.

To deal with the first problem, it is sufficient to model traces coinductively,
i.e., as streams instead of lists. To deal with the second problem, it is
necessary to alter the specification of external functions to permit other
behaviors (e.g., divergence) in addition to termination in a poststate
satisfying Q. Extended behaviors of external functions must then be propagated
through the trace semantics, e.g., by adding to the core state type C_tr
additional constructors for the additional behaviors and by updating the step
relation ⟾ to do the propagation. These variations have not been done in
Coq (yet) but would not be particularly difficult to implement.


3.3  Reach-Closed  and  Valid  Semantics 

In addition to the specialized interaction semantics I presented in the
previous section, the results of later chapters (in particular, Chapter 6) will rely
on two further specializations of the basic interface. The first specialization
is to what I call reach-closed semantics. At a high level, a reach-closed
semantics is one that writes only to memory locations leaked to it, e.g., by
following the reach-closure in memory of pointers returned to the module
by external functions. A reach-closed semantics may also write to locations
it allocated itself.

The second specialization I describe here but do not use until Chapter 6
is to valid semantics. A valid semantics is one that does not store invalid
pointers into memory (in the sense of the val_valid predicate of Chapter 2).
In the following I present the details.

3.3.1  Reach-Closed  Semantics 

Reach-closed semantics are defined by an invariant ℛ on states c, memories
m, and block sets B that satisfies the laws given in Figure 3.7.2


2 File compcomp/linking/rc_semantics.v.
Reach-Closed Invariant

    ℛ : C → mem → set block → Prop

Reach-Closed Initial Core

    initial_core ge v v⃗ = Some c
    ⟹ ∀m. ℛ c m (REACH m (blocksOf v⃗))

Reach-Closed Step

    roots (ge : G) (B : set block) = globalBlocks ge ∪ B

    ℛ c m B ∧ ge ⊢ c, m ⟼^E c′, m′
    ⟹ (1) E ⊆ REACH m (roots ge B) ∧
       (2) ℛ c′ m′ (REACH m′ (freshblks m m′
                               ∪ REACH m (roots ge B)))

Reach-Closed At External

    ℛ c m B ∧ at_external c = Some (idf, v⃗) ⟹ defined v⃗

Reach-Closed After External

    let B′ = case v_opt of None → B
                         | Some v → blocksOf (v :: nil) ∪ B
    in ℛ c m B
       ∧ at_external c = Some (idf, v⃗)
       ∧ after_external v_opt c = Some c′
       ⟹ ∀m′. ℛ c′ m′ B′

Reach-Closed Halted

    ℛ c m B ∧ halted c = Some v_ret ⟹ defined (v_ret :: nil)


Figure 3.7: Reach-Closed Semantics
Definition 2 (Reach-Closed Semantics (reach_closed)). An interaction semantics
is reach-closed iff there is an ℛ, specialized to the states and step relation of
that semantics, that satisfies the laws of Figure 3.7.

The definitions are parameterized by types G and C, by a global
environment ge : G, and by an interaction semantics of type Semantics G C mem
that defines the step relation ⟼ and the functions after_external, initial_core,
at_external, and halted.

The ℛ invariant of reach-closed semantics quantifies over: core states
of the argument semantics c : C, the memory m : mem, and a set B of
memory regions that records the memory blocks exposed to the semantics
at interaction points (via pointers in the initial argument list, in the return
values of external function calls, and by local allocation).

The roots of a block set B and global environment ge are the union of
the global blocks and B. The operative conditions of reach-closed semantics
are those that characterize the reach-closed step relation (clauses 1 and 2).
Clause (1) instruments the step relation of the underlying semantics with a
restriction on the effects E produced by the step. The judgment

    ge ⊢ c, m ⟼^E c′, m′

means: configuration c, m steps to c′, m′, writing to or freeing exactly the
locations E.3 Locations not contained in this set are guaranteed not to
be modified. E ⊆ REACH m (roots ge B) states that this set of modified
locations E is a subset of the reach-closure (in m) of the current roots.

Clause (2) asserts that the invariant can be reestablished after the step for:
the blocks reachable (in m′) from newly allocated blocks (freshblks m m′), if
any, as well as from the blocks that were originally reachable in m (this set
is REACH m (roots ge B)). This last condition ensures that the reachable set
grows monotonically at each step, by not "forgetting" locations that were
previously reachable.

The other interface laws modify B as specified above. For example, the
clause for after_external asserts that ℛ can be reestablished for B′ equal to B
union the blocks exposed by the return values of external calls (blocksOf (v ::
nil)). initial_core asserts that the invariant can be established initially, with B
equal to the blocks exposed by the closure of the initial arguments, in the
initial memory. at_external and halted (not shown) assert that the arguments
to external calls and return values, respectively, are well-defined.
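
The set computations behind clauses (1) and (2) can be sketched in a few
lines of Python. This is a simplified model and not the Coq definition of
REACH from Chapter 2: a memory is a dictionary from block numbers to lists of
values, a pointer is a tuple ("ptr", block, offset), and effects are tracked
at the granularity of whole blocks rather than individual locations. All
function names are hypothetical.

```python
def is_ptr(v):
    return isinstance(v, tuple) and len(v) == 3 and v[0] == "ptr"


def blocks_of(values):
    """Blocks mentioned by the pointer values in a list of values."""
    return {v[1] for v in values if is_ptr(v)}


def reach(m, start):
    """Blocks reachable in memory m from the blocks in start."""
    seen = set(start)
    work = list(seen)
    while work:
        b = work.pop()
        for b2 in blocks_of(m.get(b, [])):
            if b2 not in seen:
                seen.add(b2)
                work.append(b2)
    return seen


def roots(global_blocks, B):
    return set(global_blocks) | set(B)


def fresh_blocks(m, m2):
    """Blocks allocated between m and m2."""
    return set(m2) - set(m)


def clause1(global_blocks, B, m, written):
    """Clause (1): every written block is reachable from the current roots."""
    return set(written) <= reach(m, roots(global_blocks, B))


def next_block_set(global_blocks, B, m, m2):
    """Block set for which clause (2) reestablishes the invariant."""
    return reach(m2, fresh_blocks(m, m2) | reach(m, roots(global_blocks, B)))


GLOBALS = {0}
m = {0: [("ptr", 1, 0)], 1: [5], 2: [7]}
# One step later: block 3 is fresh and block 1 now points to it.
m2 = {0: [("ptr", 1, 0)], 1: [("ptr", 3, 0)], 2: [7], 3: [0]}
before = reach(m, roots(GLOBALS, set()))
after = next_block_set(GLOBALS, set(), m, m2)
```

In this example block 2 is never leaked to the module, so a step that writes
to it violates clause (1), while the reachable set grows from before to after
without dropping any block.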

Reach closure is not an unrealistic proof obligation. One can show, for
example, that all Clight programs satisfy the restrictions imposed in
Figure 3.7. In the following theorem, CLsem is the Clight interaction semantics
presented in Section 3.2.1.


3 See compcomp/core/effect_semantics.v for the formal definition.

Theorem  1  (Clight  is  Reach-Closed).  CLsem  is  reach_closed. 

Proof. The proof4 of this (perhaps counterintuitive) theorem takes advantage
of the fact that Clight programs never fabricate nonnull pointers, e.g.,
by casting an integer to a pointer and then dereferencing it. (Even in
standard C, casting an integer to a pointer, or vice versa, is only
implementation-defined, except when the pointer is null. See, e.g., the C11 standard [ISO11,
6.3.2.3].) Also perhaps counterintuitively, (safe) C pointer arithmetic does
not violate the theorem. The REACH relation that appears in Figure 3.7, and
which was first defined in Chapter 2, is closed under intrablock pointer
arithmetic.

The toplevel invariant ℛ c m B holds when either c is the initial core and
B = REACH m (blocksOf v⃗), or c, m, and B satisfy the invariant cl_core_inv,
defined by induction on the structure of c as follows.

ℛ c m B : Prop =
  (∃ v v⃗. B = REACH m (blocksOf v⃗) ∧ initial_core ge v v⃗ = Some c)
  ∨ cl_core_inv c m B

cl_core_inv c m B =
  case c of
  | State_CL f s k e te →
      cl_state_inv c m e te
      ∧ REACH m (roots ge B) ⊆ roots ge B
      ∧ cl_cont_inv c k m
  | Callstate f v⃗ k →
      blocksOf v⃗ ⊆ roots ge B
      ∧ REACH m (roots ge B) ⊆ roots ge B
      ∧ cl_cont_inv c k m
  | Returnstate v k →
      blocksOf (v :: nil) ⊆ REACH m (roots ge B)
      ∧ cl_cont_inv c k m

The key relation above is REACH m (roots ge B) ⊆ roots ge B, which
asserts that roots ge B (a set of blocks, as defined in Figure 3.7) is closed
under reachability in memory m.

In the State_CL case, the subsidiary relation cl_state_inv:

cl_state_inv c (m : mem) (B : set block) (e : var_env) (te : temp_env) =
    ∀ x b (ty : Ctypes.type). e(x) = Some (b, ty) ⟹ b ∈ roots ge B
  ∧ ∀ x v. te(x) = Some v ⟹ blocksOf (v :: nil) ⊆ roots ge B

asserts that

1. The memory region associated with every identifier x in the addressable
local variable environment e is reachable; and

2. Every temporary identifier x in the temporaries environment te maps
to a value whose blocks are contained in the current roots.


4 File compcomp/linking/safe_clight_rc.v.

The cl_cont_inv invariant applies cl_state_inv recursively to the call stack:

cl_cont_inv c (k : cont) m =
  case k of
  | Kstop → True
  | Kseq s k′ → cl_cont_inv c k′ m
  | Kloop s1 s2 k′ → cl_cont_inv c k′ m
  | Kswitch k′ → cl_cont_inv c k′ m
  | Kcall id_opt f e te k′ → cl_state_inv c m e te ∧ cl_cont_inv c k′ m

It is tedious (but not difficult) to verify that ℛ as defined above satisfies the
laws of Figure 3.7. □

3.3.2  Valid  Semantics 

A semantics is valid5 according to the following definition.

Definition 3 (Valid Semantics). A semantics is valid when there exists an
invariant ℐ, specialized to the core states of the semantics, that satisfies the laws
given in Figure 3.8.

Informally,  a  valid  semantics  is  one  that  never  stores  invalid  pointers 
into  memory.  An  invalid  pointer  is  one  that  references  a  memory  region 
that  has  not  yet  been  allocated  (freed  memory  regions  are  never  invalid). 

In the Compositional CompCert proofs, we establish that a semantics is
valid by exhibiting an invariant ℐ over core states of the semantics c and
memories m with the properties given in Figure 3.8. As in Chapters 2 and 5,
mem_valid m states that the memory m contains no invalid pointers. The
other definitions, such as vals_valid m v⃗ (the values v⃗ are all valid with
respect to m), are similar. In the clauses that mention the global environment
ge, ge is assumed to contain only valid pointers as well.


5 File compcomp/core/nucular_semantics.v.
Initially Valid

    ∀v v⃗ c. initial_core ge v v⃗ = Some c
      ∧ vals_valid m v⃗
      ∧ mem_valid m ⟹ ℐ(c, m)

Corestep Valid

    ∀c m c′ m′. ℐ(c, m)
      ∧ ge ⊢ c, m ⟼ c′, m′ ⟹ ℐ(c′, m′)

At External Valid

    ∀c m idf v⃗. ℐ(c, m)
      ∧ at_external c = Some (idf, v⃗)
      ⟹ vals_valid m v⃗ ∧ mem_valid m

After External Valid

    ∀c m v c′ m′. ℐ(c, m)
      ∧ after_external v c = Some c′
      ∧ val_valid m′ v
      ∧ forward m m′
      ∧ mem_valid m′ ⟹ ℐ(c′, m′)

Halted Valid

    ∀c m v. ℐ(c, m)
      ∧ halted c = Some v
      ⟹ val_valid m v ∧ mem_valid m


Figure 3.8: Valid semantics maintain an internal invariant ℐ : C → mem →
Prop satisfying the five properties listed above.
The Initially Valid clause states that initializing c with valid arguments
v⃗ in valid memory m results in a state satisfying ℐ(c, m). One can think of
this as the introduction rule for the invariant. Corestep Valid asserts that core
steps of the underlying semantics preserve ℐ. At External Valid states that
when c is at external, ℐ(c, m) implies that the external function arguments
v⃗ and memory m are valid. After External Valid asserts that it is possible to
reestablish ℐ after an external function call returns, under the assumptions
given in the clause. Halted Valid states that ℐ(c, m) implies that both the
return value v and memory m are valid whenever c is halted.
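
The validity predicates themselves are simple to model. The Python below is
a sketch under assumptions that are not spelled out in this section: a block
counts as allocated when its number is below a nextblock counter (so freed
blocks stay valid, as stated above), pointers are tuples ("ptr", block,
offset), and forward is approximated by monotonicity of nextblock. The Mem
class is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Mem:
    nextblock: int                      # blocks 0 .. nextblock-1 were allocated
    contents: Dict[int, List[object]] = field(default_factory=dict)


def is_ptr(v):
    return isinstance(v, tuple) and len(v) == 3 and v[0] == "ptr"


def block_valid(m, b):
    return 0 <= b < m.nextblock


def val_valid(v, m):
    """Non-pointer values are always valid; pointers need an allocated block."""
    return (not is_ptr(v)) or block_valid(m, v[1])


def vals_valid(m, vs):
    return all(val_valid(v, m) for v in vs)


def mem_valid(m):
    """The memory stores no invalid pointers."""
    return all(vals_valid(m, vs) for vs in m.contents.values())


def forward(m, m2):
    """Approximation: blocks allocated in m remain allocated in m2."""
    return m.nextblock <= m2.nextblock


ok = Mem(2, {0: [("ptr", 1, 0)], 1: [3]})
freed = Mem(2, {0: [("ptr", 1, 0)]})    # block 1 was freed, pointer still valid
bad = Mem(2, {0: [("ptr", 5, 0)]})      # block 5 was never allocated
grown = Mem(3, {0: [("ptr", 2, 0)], 2: [9]})
```

The pair (ok, grown) illustrates the side conditions of After External Valid:
the post-call memory is forward of the pre-call memory and is itself valid.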

It  is  possible  to  show  that  CompCert  x86  assembly  is  a  valid  semantics. 

Theorem  2  (CompCert  x86  is  Valid).  Asmsem  is  valid. 

Proof. Asmsem is the x86 assembly semantics given in Section 3.2.2. To prove6
that Asmsem is valid, we must exhibit an ℐ satisfying the laws in Figure 3.8.
Let ℐ equal:

ℐ c m : Prop =
  mem_valid m ∧
  case c of
  | State_ASM rs lf → regset_valid m rs ∧ loadframe_valid m lf
  | Asm_CallstateIn b_f v⃗ f r →
      block_valid m b_f ∧ vals_valid m v⃗
  | Asm_CallstateOut _ v⃗ rs lf →
      regset_valid m rs ∧ loadframe_valid m lf ∧ vals_valid m v⃗

with regset_valid and loadframe_valid defined as follows:

regset_valid m (rs : regset) = ∀ r. val_valid rs(r) m

loadframe_valid m (lf : load_frame) =
  case lf of mkLoadFrame b_stk r → block_valid m b_stk

□


6 File compcomp/backend/Asm_nucular.v.


Chapter 4

Language-Independent Linking


In Chapter 3, I introduced interaction semantics in order to interpret the
behavior of isolated modules. Trace semantics T introduced the notion of
an operator over interaction semantics. In this chapter, I define a second
operator

    ℒ(⟦S_0⟧, ⟦S_1⟧, …, ⟦S_{N−1}⟧)

over interaction semantics that models the linked behavior of a set of
interacting modules, as given by a multimodule program P = S_0, S_1, …, S_{N−1}.
Each S_i here ranges over (the syntax of) a program in a language such as
Clight or x86 assembly (though linking semantics as I present it in this
chapter is formally language-independent; one can start directly from
semantics). ⟦S_i⟧ is the semantics of such a module, as defined by the Modsem
record below.


4.1  Linking  Semantics 

As input, ℒ takes N interaction semantics, each with (perhaps) a different
global environment and core state type (for example, the modules may be
written in different languages). The global environments of
the modules must have equal domain (map the same set of addresses).1


1 Throughout this thesis, C-semantics are assumed to satisfy this property. See
Section 7.3.1, Theorem 9 for further discussion, in particular of why this assumption
is compatible with programs that declare different (but consistent) sets of global
variables.
The output of ℒ is a new interaction semantics

    ⟦P⟧ = ℒ(⟦S_0⟧, ⟦S_1⟧, …, ⟦S_{N−1}⟧)

that models the linked program execution by maintaining as its own core
state a (heterogeneous) stack of the modules' core states. Each "frame" on
the stack corresponds to a runtime invocation of one of the modules in the
program. Cross-module function calls result in new cores being pushed
onto the stack (initialized via initial_core); returning from such a function
pops the top core from the stack and injects the return value into the state
of the caller, using after_external.
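
The stack discipline just described can be prototyped directly. The Python
below is a sketch, not the Coq definition of the linking semantics: modules
are objects exposing the five interface functions, functions are looked up by
name in a table plt rather than through global environments and block
addresses, and the two toy modules (Caller and Doubler) are hypothetical.
Their core states have different shapes, standing in for modules written in
different languages.

```python
class Caller:
    """Core states: ('call', args) while blocked, then ('done', v)."""
    def initial_core(self, fname, args):
        return ("call", args)
    def step(self, c, m):
        return None                     # no internal steps in this toy module
    def at_external(self, c):
        return ("double", c[1]) if c[0] == "call" else None
    def after_external(self, retv, c):
        return ("done", retv)
    def halted(self, c):
        return c[1] if c[0] == "done" else None


class Doubler:
    """Core states: ('run', n), then ('ret', 2 * n)."""
    def initial_core(self, fname, args):
        return ("run", args[0])
    def step(self, c, m):
        return (("ret", 2 * c[1]), m) if c[0] == "run" else None
    def at_external(self, c):
        return None
    def after_external(self, retv, c):
        return None
    def halted(self, c):
        return c[1] if c[0] == "ret" else None


def linked_step(modules, plt, stack, m):
    """One step of the linked program, or None if no rule applies."""
    idx, c = stack[-1]
    mod = modules[idx]
    out = mod.step(c, m)
    if out is not None:                             # internal step of top core
        c2, m2 = out
        return stack[:-1] + [(idx, c2)], m2
    call = mod.at_external(c)
    if call is not None:                            # cross-module call: push
        fname, args = call
        callee = plt.get(fname)
        if callee is None:
            return None
        return stack + [(callee, modules[callee].initial_core(fname, args))], m
    retv = mod.halted(c)
    if retv is not None and len(stack) > 1:         # return: pop and resume
        caller_idx, caller = stack[-2]
        c2 = modules[caller_idx].after_external(retv, caller)
        if c2 is None:
            return None
        return stack[:-2] + [(caller_idx, c2)], m
    return None


modules = [Caller(), Doubler()]
plt = {"main": 0, "double": 1}
stack = [(0, modules[0].initial_core("main", (21,)))]
m = {}
depths = []
while True:
    out = linked_step(modules, plt, stack, m)
    if out is None:
        break
    stack, m = out
    depths.append(len(stack))
```

The run pushes a Doubler core for the call, steps it once, and pops it on
return, leaving a single Caller core that holds the returned value.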

The modules S_i are written in different languages, whose states may
have different (Coq) types. In order to treat these modules uniformly in
ℒ, we wrap their interaction semantics by existentially quantifying over the
core state types of each module, an operation we encapsulate in the type
Modsem.


Record Modsem : Type =
  mkModsem {
    F, V, C : Type;
    ge : Genv F V;
    sem : Semantics (Genv F V) C mem
  }

In this dependently typed record, the types of ge and sem depend on F, V,
and C. This module is written in programming language F (e.g., Clight or
x86), whose global variables have type-specification language V (e.g., Clight
types or unit), and whose core states have type C (e.g., Clight
nonaddressable locals and control stack, or x86 register bank). We also existentially
bind the global environment ge that was statically initialized for this
module. It maps addresses to global variables and function bodies (and global
identifiers to the addresses at which they are defined). All the inputs to ℒ
must have ge functions that map exactly the same global addresses (modules
that fail to declare some unused external global variables or functions
can always be made to do so, by safety monotonicity).

The final component is sem, an interaction semantics. It defines the
interface functions initial_core, at_external, after_external, and halted, as well
as a step relation ge ⊢ c, m ⟼ c′, m′. Modules in the same language will
typically have identical · ⊢ · ⟼ · relations, specialized by different ge
components that map disjoint sets of addresses to internal function bodies
(as opposed to external function declarations). In what follows, we use ⟦·⟧
to refer interchangeably to the interaction semantics of modules and their
Modsem wrappers.
The output of ℒ is an interaction semantics in the LinkedState "language."
LinkedState is parameterized by modules, a map from module indices in the
range [0, N) to module semantics, where N is the (nonzero) number of
translation units in the program.

Record Core (N : pos) (modules : I_N → Modsem) =
  mkCore {
    idx : I_N;
    core : C (modules idx)
  }

Core models the runtime state of a sequential execution thread. I_N is the
(dependent) type of integers in range [0, N). The idx of a Core is the index of
the module from which the core was initialized. The core field of the record
(of dependent type C (modules idx) of core states of module idx) gives the
current runtime state of this particular core.

The  runtime  state  of  a  linked  program  is  then: 

Record LinkedState (N : pos) (modules : I_N → Modsem) :=
  mkLinkedState {
    plt : ident → option I_N;
    stack : Stack (Core N modules)
  }

The two fields of LinkedState are: the procedure linkage table plt, mapping
function names (type ident) to the indices of the modules in which the
functions are defined, if any (option I_N), and a stack of cores. We model
the plt as a field in the LinkedState record, as opposed to deriving it from N
and modules, to retain flexibility to do dynamic linking in the future. The
stack is always nonempty; all cores except the topmost one are at external
($\forall c \in$ (pop stack). at_external c = Some _).
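
As an informal illustration of the two records and the stack invariant just stated, here is a minimal Python sketch. It is not the Coq development: the dependent types are dropped, module indices are plain integers, and the helper well_formed and the toy at_external function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple

@dataclass(frozen=True)
class Core:
    idx: int     # index of the module from which this core was initialized
    core: Any    # that module's own core state (untyped in this sketch)

@dataclass(frozen=True)
class LinkedState:
    plt: Dict[str, int]        # function name -> index of the defining module
    stack: Tuple[Core, ...]    # stack of cores; the top is the last element

def well_formed(at_external: Dict[int, Callable[[Any], Optional[Any]]],
                n_modules: int, l: LinkedState) -> bool:
    """Check the invariants from the text on a linked state (hypothetical helper)."""
    if not l.stack:                                    # the stack is never empty
        return False
    if any(not (0 <= c.idx < n_modules) for c in l.stack):
        return False
    if any(not (0 <= i < n_modules) for i in l.plt.values()):
        return False
    # every core except the topmost one must be blocked at an external call
    return all(at_external[c.idx](c.core) is not None for c in l.stack[:-1])

# Toy cores (an assumption of this sketch): ("call", fname) is at external,
# ("run",) is an ordinary running state.
def toy_at_external(c):
    return c[1] if c[0] == "call" else None

table = {0: toy_at_external, 1: toy_at_external}
plt = {"f": 0, "g": 1}
good = LinkedState(plt, (Core(0, ("call", "g")), Core(1, ("run",))))
bad = LinkedState(plt, (Core(0, ("run",)), Core(1, ("run",))))
```

Here good satisfies the invariant (the lower core is waiting on a call to g), while bad does not, because its lower core is not at external.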

Figure 4.2 gives the step relation. There are three rules.

The Step rule deals with the case in which the topmost core on the call
stack (c = peek l.stack) takes a normal internal step ($ge_c \vdash c, m \longmapsto c', m'$).
$ge_c$ is the global environment associated with the module from
which c was initialized.²


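² This $ge_c$ need not have the same type as the linked-program ge, or that of the
global environments of other modules. Since each module may be implemented
in a different language, each $ge_{c.idx}$ will in general map function addresses to
function bodies of different types. We do require that the individual ges have equal
domain (map the same set of global addresses). Section 7.3.1 of Chapter 7, in which

[Figure 4.1 diagram: three panels, Step, Call, and Return, each showing how the stack of cores changes. Side conditions shown with the panels:
Step: $ge_c \vdash c, m \longmapsto c', m'$.
Call: at_external c = Some ($id_f$, $\vec{v}$); l.plt $id_f$ = Some idx; $ge_{idx}$ $id_f$ = Some $b_f$; initial_core $ge_{idx}$ (modules idx) (Vptr ($b_f$, 0)) $\vec{v}$ = Some c'.
Return: halted c' = Some v; after_external c v = Some c''.]

Figure 4.1: Cases of the linking corestep relation. Gray dashed boxes are
"stacks-of-cores." Side conditions for each of the three rules are on the right.

Judgment form: $ge \vdash l, m \Longmapsto l', m'$

$$
\frac{c = \mathsf{peek}\ l.\mathsf{stack} \qquad ge_c \vdash c, m \longmapsto c', m'}
     {ge \vdash l, m \Longmapsto l\ \mathsf{with}\ \{\mathsf{stack} := \mathsf{push}\ c'\ (\mathsf{pop}\ l.\mathsf{stack})\},\ m'}
\quad \text{(Step)}
$$

$$
\frac{\begin{array}{c}
c = \mathsf{peek}\ l.\mathsf{stack} \qquad \mathsf{at\_external}\ c = \mathsf{Some}\ (id_f, \vec{v}) \\
l.\mathsf{plt}\ id_f = \mathsf{Some}\ idx \qquad ge_{idx}\ id_f = \mathsf{Some}\ b_f \\
\mathsf{initial\_core}\ ge_{idx}\ (\mathit{modules}\ idx)\ (\mathsf{Vptr}\ (b_f, 0))\ \vec{v} = \mathsf{Some}\ c'
\end{array}}
     {ge \vdash l, m \Longmapsto l\ \mathsf{with}\ \{\mathsf{stack} := \mathsf{push}\ c'\ l.\mathsf{stack}\},\ m}
\quad \text{(Call)}
$$

$$
\frac{\begin{array}{c}
\mathsf{size}\ l.\mathsf{stack} > 1 \qquad c = \mathsf{peek}\ l.\mathsf{stack} \\
\mathsf{halted}\ c = \mathsf{Some}\ v \qquad c' = \mathsf{peek}\ (\mathsf{pop}\ l.\mathsf{stack}) \\
\mathsf{after\_external}\ (\mathsf{Some}\ v)\ c' = \mathsf{Some}\ c''
\end{array}}
     {ge \vdash l, m \Longmapsto l\ \mathsf{with}\ \{\mathsf{stack} := \mathsf{push}\ c''\ (\mathsf{pop}\ (\mathsf{pop}\ l.\mathsf{stack}))\},\ m}
\quad \text{(Return)}
$$

Figure 4.2: Corestep relation of Linking Semantics $\mathcal{L}$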

In this case, we just propagate the new core state c' and memory
m' to the result state of the overall linking judgment. The notation
l with {stack := push c' (pop l.stack)} updates the topmost core state
on the stack. For readability, we elide the operations required to
propagate the idx field of Core records.

The second rule, Call, handles the case in which the topmost core on
the stack is at_external (at_external c = Some ($id_f$, $\vec{v}$)), making a
cross-module function call. In this case, we use the initial_core function of the
module semantics that defines function $id_f$ (l.plt $id_f$ = Some idx) to
initialize a new core state (in the global environment $ge_{idx}$ associated
with modules idx) to handle the function call:

initial_core $ge_{idx}$ (modules idx) (Vptr ($b_f$, 0)) $\vec{v}$ = Some c'

The core c' is then pushed onto the stack (l with {stack := push c' l.stack})
to become the new running core.

our safety proofs for linking semantics impose a stronger correspondence among
the individual ges, tightens this requirement, while also presenting additional
justification.

The Return rule models external function returns. In this case, the core
state c is halted with return value v (halted c = Some v).
To resume execution, we use the after_external function exposed by the
caller's semantics (c' = peek (pop l.stack)) to inject the return value v
(after_external (Some v) c' = Some c''). State c is then popped from the
stack, and c' is updated to c'':

l with {stack := push c'' (pop (pop l.stack))}
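
The three rules can be read as a small interpreter loop. The Python sketch below is an informal model under several assumptions that are not part of the Coq development: functions are looked up by name rather than through $ge_{idx}$ and Vptr ($b_f$, 0), option values are modeled with None, module semantics are deterministic functions, and the instruction set of toy_module (add, call, ret) is invented purely for the example.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Dict, Tuple

@dataclass(frozen=True)
class Modsem:                    # one module's interaction semantics
    initial_core: Callable       # (fname, args) -> core or None
    at_external: Callable        # core -> (fname, args) or None
    after_external: Callable     # (ret, core) -> core or None
    halted: Callable             # core -> return value or None
    step: Callable               # (core, mem) -> (core, mem) or None

@dataclass(frozen=True)
class Core:
    idx: int
    core: Any

@dataclass(frozen=True)
class LinkedState:
    plt: Dict[str, int]
    stack: Tuple[Core, ...]      # top of the stack is the last element

def link_step(modules, l, m):
    """One linking step; returns (l', m'), or None when no rule applies."""
    top = l.stack[-1]
    sem = modules[top.idx]
    v = sem.halted(top.core)
    if v is not None:                                    # Return
        if len(l.stack) <= 1:
            return None          # nothing to return to: the linked program halted
        caller = l.stack[-2]
        c2 = modules[caller.idx].after_external(v, caller.core)
        if c2 is None:
            return None
        return replace(l, stack=l.stack[:-2] + (Core(caller.idx, c2),)), m
    ext = sem.at_external(top.core)
    if ext is not None:                                  # Call
        fname, args = ext
        idx = l.plt.get(fname)
        if idx is None:
            return None          # defined by no module: linked program is at external
        c1 = modules[idx].initial_core(fname, args)
        if c1 is None:
            return None
        return replace(l, stack=l.stack + (Core(idx, c1),)), m
    r = sem.step(top.core, m)                            # Step
    if r is None:
        return None
    c1, m1 = r
    return replace(l, stack=l.stack[:-1] + (Core(top.idx, c1),)), m1

def toy_module(funcs):
    """Toy language: a core is (code, pc, acc); code is a tuple of instructions."""
    def initial_core(fname, args):
        return (funcs[fname], 0, args[0]) if fname in funcs and len(args) == 1 else None
    def at_external(c):
        code, pc, acc = c
        return (code[pc][1], [acc]) if code[pc][0] == "call" else None
    def after_external(ret, c):
        code, pc, _ = c
        return (code, pc + 1, ret) if code[pc][0] == "call" else None
    def halted(c):
        code, pc, acc = c
        return acc if code[pc][0] == "ret" else None
    def step(c, m):
        code, pc, acc = c
        return ((code, pc + 1, acc + code[pc][1]), m) if code[pc][0] == "add" else None
    return Modsem(initial_core, at_external, after_external, halted, step)

def run(modules, plt, entry, arg, fuel=100):
    """Iterate link_step from the entry point; also record the deepest stack seen."""
    idx = plt[entry]
    l = LinkedState(plt, (Core(idx, modules[idx].initial_core(entry, [arg])),))
    m, depth = {}, 1
    for _ in range(fuel):
        nxt = link_step(modules, l, m)
        if nxt is None:
            break
        l, m = nxt
        depth = max(depth, len(l.stack))
    return l, depth

mod_a = toy_module({"main": (("add", 1), ("call", "inc2"), ("add", 10), ("ret",))})
mod_b = toy_module({"inc2": (("add", 2), ("ret",))})
modules = {0: mod_a, 1: mod_b}
final, max_depth = run(modules, {"main": 0, "inc2": 1}, "main", 0)
```

In this run, main (module 0) calls inc2 (module 1): the Call rule pushes a second core, the Return rule pops it and resumes the caller through after_external, and execution stops with a single halted core on the stack.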

The stack is an abstraction of the activation-record stack of a C or assembly
program. Internal calls (within one module) do not push on our stack;
they transition from one core (and memory) to another core (and memory)
within the same top stack element. But of course this core/memory may be
the abstraction/implementation of pushing and popping (module-local)
activation records. Different modules may or may not share a "real" activation
stack.

By case analysis on $\Longmapsto$, we get that if all the module semantics are
deterministic, then so is the linked semantics $\mathcal{L}$.

Theorem 3 ($\mathcal{L}$-Determinism).

$$(\forall M \in \{M_0, \ldots, M_{N-1}\}.\ \mathsf{deterministic}\ M) \implies \mathsf{deterministic}\ \mathcal{L}(M_0, \ldots, M_{N-1})$$

Proof. In Coq.³ □

$\mathcal{L}$'s final ingredient is the definition of the interface functions: initial_core,
at_external, after_external, and halted. In order to reduce the number of
explicit case analyses and to aid comprehension, I present the code in monadic
style.

A linking semantics is initialized (initial_core) by spawning a new core
to handle the entry point function that was called.

1  initial_core (ge : G) (v : V) (v⃗ : list V) =
2    do { Vptr (b_f, 0) ← v;
3         id_f ← invertSymbol ge b_f;
4         idx ← plt id_f;
5         c ← initial_core (modules idx) (Vptr (b_f, 0)) v⃗;
6         return (Some (mkLinkedState plt (singletonStack c))) }

First, we look up the identifier $id_f$ associated with function pointer $b_f$, if any
(line 3). Then, we determine the index idx of the module that defines $id_f$


³ Lemma linking_det in file compcomp/linking/compcert_linking.v.
Semantics G (LinkedState N modules) mem

initial_core (ge : G) (v : V) (v⃗ : list V) =
  do { Vptr (b_f, 0) ← v;
       id_f ← invertSymbol ge b_f;
       idx ← plt id_f;
       c ← initial_core (modules idx) (Vptr (b_f, 0)) v⃗;
       return (Some (mkLinkedState plt (singletonStack c))) }

at_external (l : LinkedState N modules) : option (F × list V) =
  let c = peek l.stack in
  do { (id_f, v⃗) ← at_external c;
       case l.plt id_f of
         None → return (Some (id_f, v⃗))
         Some _ → return None }

after_external (vopt : option V) (l : LinkedState N modules)
    : option (LinkedState N modules) =
  let c = peek l.stack in
  do { c' ← after_external vopt c;
       return (Some l with {stack := push c' (pop l.stack)}) }

halted (l : LinkedState N modules) : option V =
  let c = peek l.stack in
  do { v ← halted c;
       if size l.stack = 1 then return (Some v)
       else return None }

corestep = (⟾)

Figure 4.3: Interaction semantics of program linking. G is the type
Genv unit unit. Corestep relation is as defined in Figure 4.2.

(line 4) and initialize module idx with function pointer Vptr ($b_f$, 0) and
arguments $\vec{v}$ (line 5), producing initial core c. We return the new LinkedState
(mkLinkedState plt (singletonStack c)). The plt used in line 4 is a parameter to
the definition, which is then stored in the linker state.⁴ I also assume the
existence of a global environment ge : Genv unit unit with the same address set
as the $ge_i$ of each module in modules. Such a ge can always be constructed
(by mapping global addresses to unit) whenever the $ge_i$ are consistent (the
linking semantics is undefined otherwise).

A  linking  semantics  is  at  external  when  the  topmost  core  on  the  stack  is 
at  external,  calling  a  function  defined  by  none  of  the  modules  (otherwise, 
we  would  have  initialized  and  pushed  a  new  core  to  handle  the  function). 

at_external (l : LinkedState N modules) : option (F × list V) =
  let c = peek l.stack in
  do { (id_f, v⃗) ← at_external c;
       case l.plt id_f of
         None → return (Some (id_f, v⃗))
         Some _ → return None }

Line  3  peeks  the  top  core  on  the  stack.  (The  "callstack  nonempty"  invariant 
maintained  by  linking  semantics  ensures  that  such  a  core  always  exists.) 

To inject a return value into linker states (after_external), we inject the
value into the topmost core state on the stack (after_external vopt c = Some c',
line 4).

after_external (vopt : option V) (l : LinkedState N modules)
    : option (LinkedState N modules) =
  let c = peek l.stack in
  do { c' ← after_external vopt c;
       return (Some l with {stack := push c' (pop l.stack)}) }

The LinkedState we return in this case is the same as l but with the topmost
core c replaced by c'.

Finally, a linking semantics is halted when the stack contains a singleton
halted core state (halted c and size l.stack = 1), i.e., the topmost core is halted
and has no return context.


⁴ Storing the PLT in LinkedState makes it possible to model operations that
change the PLT, like dynamic linking.

halted (l : LinkedState N modules) : option V =
  let c = peek l.stack in
  do { v ← halted c;
       if size l.stack = 1 then return (Some v)
       else return None }
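
The three observation functions of the linked semantics can be mirrored in a few lines of Python. This is a sketch only: the names linked_at_external, linked_after_external, and linked_halted, the pair representation of cores, and the tagged-tuple toy module are assumptions made for illustration, and Python's None stands in for the option type (so a return value of None cannot be represented).

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Dict, Tuple

@dataclass(frozen=True)
class Modsem:
    at_external: Callable      # core -> (fname, args) or None
    after_external: Callable   # (vopt, core) -> core or None
    halted: Callable           # core -> value or None

@dataclass(frozen=True)
class LinkedState:
    plt: Dict[str, int]
    stack: Tuple[Tuple[int, Any], ...]   # (module index, core); top is last

def linked_at_external(modules, l):
    idx, c = l.stack[-1]
    ext = modules[idx].at_external(c)
    if ext is None:
        return None
    fname, args = ext
    # Only calls to functions defined by no linked module are visible outside.
    return None if fname in l.plt else (fname, args)

def linked_after_external(modules, vopt, l):
    idx, c = l.stack[-1]
    c2 = modules[idx].after_external(vopt, c)
    if c2 is None:
        return None
    return replace(l, stack=l.stack[:-1] + ((idx, c2),))   # replace the top core

def linked_halted(modules, l):
    idx, c = l.stack[-1]
    v = modules[idx].halted(c)
    if v is None:
        return None
    return v if len(l.stack) == 1 else None   # halted only with no return context

# Toy module (hypothetical): cores are ("call", fname, args) or ("done", value).
toy = Modsem(
    at_external=lambda c: (c[1], c[2]) if c[0] == "call" else None,
    after_external=lambda v, c: ("done", v) if c[0] == "call" else None,
    halted=lambda c: c[1] if c[0] == "done" else None,
)
modules = {0: toy}
plt = {"g": 0}
calling_out = LinkedState(plt, ((0, ("call", "putchar", [97])),))
calling_in = LinkedState(plt, ((0, ("call", "g", [1])),))
nested_done = LinkedState(plt, ((0, ("call", "g", [1])), (0, ("done", 7))))
top_done = LinkedState(plt, ((0, ("done", 7)),))
```

A call to putchar, which no module defines, is reported by the linked program as an external call, whereas a call to g is hidden because the Call rule will handle it internally. Likewise a halted core with a caller beneath it does not make the linked program halted.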

4.2  Contextual  Equivalence 

Linking semantics leads to a natural notion of semantic context: Take
program contexts C to be arbitrary module semantics Modsem. Then the
application of a program context to an (open) multimodule program P is
just the semantics that results from linking the program with that context:

$$\mathcal{L}(C, [\![P]\!]).$$

We can then define contextual equivalence of two open multimodule
programs $P_S$ and $P_T$ as equitermination (halted in interaction semantics) in
all contexts:

Definition 4 (Contextual Equivalence).

$$P_S \sim P_T \;\triangleq\; \forall C.\ \mathcal{L}(C, [\![P_S]\!])\Downarrow \iff \mathcal{L}(C, [\![P_T]\!])\Downarrow$$

$P\Downarrow$ is termination of program P. The context C observes the state of
memory (and the arguments to external calls) when the program interacts with
the environment. To distinguish $P_S$ and $P_T$, C can, e.g., get stuck (as
opposed to safely terminating) at one of these interaction points if the memory
state and arguments fail to satisfy a particular predicate.

The above definition plays a bit fast and loose with the initial arguments
and memory states in which the two programs $P_S$ and $P_T$ are executed;
these details will be made precise when we present reach-closed contextual
equivalence in Section 6.2.1.


4.3  Gallina  Contexts 

This  notion  of  context-as-interaction-semantics  is  quite  general:  it  supports 
the  definition  of  program  contexts  in  arbitrary  languages,  e.g.,  Clight  and 
x86,  but  also  Coq's  Gallina.  As  an  example  Gallina  context,  consider  the 
following  Gallina  semantics  (cf.  Section  3.2.3)  that  enforces  the  protocol: 

The character argument c to external function putchar satisfies
the predicate isLowerAlpha: 'a' <= c && c <= 'z'.

Recall that Gallina semantics are parameterized by a relation R of type
$\forall (\vec{v} : \mathsf{list}\ V)(m_{pre} : M)(m_{post} : M).\ \mathsf{Prop}$. The corestep relation of a Gallina
semantics is defined only when this R is satisfied:

$$
\begin{array}{l}
\mathsf{corestep}\ (ge : G)\ (c : \mathsf{gallinaState})\ (m : M)\ (c' : \mathsf{gallinaState})\ (m' : M) = \\
\quad (\exists R : \mathsf{gallinaRel}.\ \exists \vec{v}.\ c = \mathsf{Some}\ (R, \vec{v}) \land R\ \vec{v}\ m\ m') \land c' = \mathsf{None}
\end{array}
$$

To "check" that isLowerAlpha holds at each call to putchar, we therefore
define R as follows:

$$R(\vec{v}, m_{pre}, m_{post}) = \exists c.\ \vec{v} = (c :: \mathsf{nil}) \land \mathsf{isLowerAlpha}(c) \land m_{post} = m_{pre}$$

The relation R is undefined (and the corestep relation of the Gallina context
stuck) when either: (1) the arguments $\vec{v}$ to putchar are not of shape (c ::
nil), or (2) c does not satisfy isLowerAlpha.
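
The same check can be written as an executable predicate. The following Python sketch is only an analogue of the Gallina relation: it assumes characters are passed as integer code points, models memories as dictionaries, and represents a Gallina core state as either None or a pair of a relation and an argument list. The function names are illustrative, not taken from the Coq sources.

```python
def is_lower_alpha(c: int) -> bool:
    # 'a' <= c && c <= 'z' on character codes
    return ord("a") <= c <= ord("z")

def R(args, m_pre, m_post) -> bool:
    """Holds iff args has shape [c], c is a lowercase letter, and memory is unchanged."""
    return len(args) == 1 and is_lower_alpha(args[0]) and m_post == m_pre

def corestep(c, m, c_next, m_next) -> bool:
    """The Gallina context steps only when its relation holds; it then becomes None."""
    if c is None:
        return False
    rel, args = c
    return rel(args, m, m_next) and c_next is None

mem = {}
ok_call = (R, [ord("q")])      # putchar('q')
bad_call = (R, [ord("Q")])     # putchar('Q'): uppercase, so the context is stuck
```

With these definitions, corestep(ok_call, mem, None, mem) holds, while no choice of successor state makes corestep(bad_call, ...) hold, which is how a stuck context distinguishes programs.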

4.4  Stateful  Contexts 

The  isLowerAlpha  protocol  above  is  stateless,  in  the  sense  that  it  does 
not  predicate  over  the  history  of  interactions  up  to  a  certain  point.  It  is  also 
possible  to  define  stateful  Gallina  contexts  that  do  observe  properties  of  the 
program  trace. 

For  example,  imagine  we  would  like  to  enforce  the  protocol: 

In  every  execution  of  the  program,  a  particular  external  function 
f  is  always  called  before  a  second  external  function  g. 

Perhaps f is an initialization routine, or provides access to a particular
resource (e.g., a file), while g accesses this resource.

Why  should  this  specification  be  preserved  by  the  compiler?  Recall  that, 
while  the  compiler  is  allowed  to  reorder  calls  to  internal  functions — as  long 
as  such  reorderings  are  justified  semantically — it  is  never  allowed  to  reorder 
calls  to  external  functions,  since  such  calls  are  observable  in  interaction 
semantics.  The  order  in  which  calls  to  external  functions  are  made  must  be 
preserved. 

We can construct a context that observes the order in which f and g
are called as follows. First, extend the Gallina semantics of Section 3.2.3 to
predicate over the global environment and function pointer of the called
external function, in addition to the arguments and pre- and postmemories.
That is, R is now a relation of type:

$$R : \forall (ge : G)(v_f : V)(\vec{v} : \mathsf{list}\ V)(m_{pre} : M)(m_{post} : M).\ \mathsf{Prop}$$

with ge : G the global environment and $v_f$ the additional value parameter.
G is the type Genv unit unit (in which function bodies and variable type
annotations are of type unit). The global environment will be used to look up
the memory location associated with a ghost global variable, $id_{hist}$, storing
context state (explained below; initialized at program startup to 0).

In order to case analyze in R on which external function is called at each
interaction point, we update the initial_core function of the Gallina context
to store the $v_f$ : V, in addition to R and $\vec{v}$:

initial_core (R : gallinaRel) (ge : G) (v_f : V) (v⃗ : list V) : option gallinaState
  = Some (R, v_f, v⃗)

with the type gallinaState appropriately extended to:

gallinaState = option (gallinaRel × V × list V)

The corestep relation must be updated as well, to pass the $v_f$ to R.

Now we define R as follows:

1   R(ge, v_f, v⃗, m_pre, m_post) =
2     do { l_hist ← ge(id_hist);
3          Vint n ← m_pre(l_hist);
4          Vptr (b_f, 0) ← v_f;
5          id_f ← invertSymbol ge b_f;
6          if n == 0 then
7            if id_f == f then return m_post = m_pre[l_hist ↦ Vint 1]
8            else if id_f == g then return False
9            else return m_post = m_pre
10         else return m_post = m_pre }

The code is presented in monadic style. Operations that fail do so by
returning False. For example, the code on Line 2 desugars to:

case ge(id_hist) of None → False | Some l_hist → ...

In lines 2 through 5, we look up the location $l_{hist}$ associated with
identifier $id_{hist}$, read the value Vint n at that address, case analyze the value $v_f$,
returning a pointer Vptr ($b_f$, 0), and do a reverse lookup in the ge for the
identifier $id_f$ associated with block $b_f$. If any of these operations fails, R
evaluates to False.

Line 6 branches on the value of n. When n = 0 (the initial state at
program startup), we do an inner case analysis on the identifier $id_f$. In
the expected case, in which $id_f$ = f, we change state by asserting that the
postmemory $m_{post}$ equals the prememory with location $l_{hist}$ updated to the
value Vint 1 ($m_{pre}[l_{hist} \mapsto \mathsf{Vint}\ 1]$). If $id_f$ equals g when n = 0, the program
has violated the policy, in which case we return False. When $id_f$ is neither f
nor g, or n is no longer 0, we assert that the memory is unchanged.
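
To see the stateful protocol in action, the relation can be transcribed into Python and driven over a sequence of external calls. This is an illustrative sketch under simplifying assumptions: the global environment is a dictionary from identifiers to block numbers, memory is a dictionary from blocks to integers (standing in for Vint n), function pointers are tuples ("Vptr", block, 0), and run_trace is a hypothetical driver that searches the two candidate postmemories the relation can accept.

```python
ID_HIST = "hist"   # name of the ghost global variable (an assumption of this sketch)

def invert_symbol(ge, block):
    """Reverse lookup: the identifier bound to a block, if any."""
    for name, b in ge.items():
        if b == block:
            return name
    return None

def R(ge, vf, args, m_pre, m_post) -> bool:
    l_hist = ge.get(ID_HIST)                       # line 2
    if l_hist is None:
        return False
    n = m_pre.get(l_hist)                          # line 3
    if not isinstance(n, int):
        return False
    if not (isinstance(vf, tuple) and len(vf) == 3
            and vf[0] == "Vptr" and vf[2] == 0):   # line 4
        return False
    id_f = invert_symbol(ge, vf[1])                # line 5
    if id_f is None:
        return False
    if n == 0:                                     # lines 6-9
        if id_f == "f":
            return m_post == {**m_pre, l_hist: 1}
        if id_f == "g":
            return False
        return m_post == m_pre
    return m_post == m_pre                         # line 10

def run_trace(ge, calls, m):
    """Thread memory through a list of external calls; None means the context is stuck."""
    l_hist = ge.get(ID_HIST)
    for name in calls:
        vf = ("Vptr", ge[name], 0)
        candidates = [m] + ([{**m, l_hist: 1}] if l_hist is not None else [])
        nxt = next((mp for mp in candidates if R(ge, vf, [], m, mp)), None)
        if nxt is None:
            return None
        m = nxt
    return m

ge = {"hist": 0, "f": 1, "g": 2, "h": 3}
initial_memory = {0: 0}    # the ghost variable starts at 0
```

Running run_trace(ge, ["f", "g"], initial_memory) succeeds and leaves the ghost variable set to 1, whereas a trace that calls g before f gets stuck at the first call.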


Discussion. Whether this relation adequately models the specification
presented informally above depends on a number of factors. The most
important is the use of memory to record the integer n. Because the entire
memory is communicated between modules at each intermodule function
call, linking semantics does not directly prevent a function in one module
from overwriting the location $l_{hist}$ used to store the context state. It is
possible, however, to prove that such overwrites do not occur, e.g., by proving
a global invariant on the value in memory at $l_{hist}$, or by showing that the
program is safe when executed in an initial state that does not contain valid
memory at location $l_{hist}$ (the location can be initialized with read-only or
even empty permission, for example, causing writes to get stuck). Because
the context is "implemented" as a Gallina relation, it can update the value
at $l_{hist}$ regardless of the permissions, by bypassing the memory model
interface. The other modules in the program must be implemented in languages
(e.g., Clight or x86) that respect the CompCert permission model.

There is a second way in which to model the f-before-g specification that
bypasses the memory issues, at the cost of increased complexity in linking
semantics. This is to directly record module-local state (nonaddressed
file-scope static variables in C), in the form of a finite map

stateType : I_N → Type

mapping module indices to the type of auxiliary state used by each module,
and a second map

moduleStates : ∀idx : I_N. stateType idx

recording the current state associated with each module. As opposed to
core states, which are initialized at each module entry, module-local states
would persist across multiple dynamic invocations of each module. For
example, the Call rule of Figure 4.2 would be updated to:

$$
\frac{\begin{array}{c}
(c, \omega) = \mathsf{peek}\ l.\mathsf{stack} \qquad \mathsf{at\_external}\ c = \mathsf{Some}\ (id_f, \vec{v}) \\
l.\mathsf{plt}\ id_f = \mathsf{Some}\ idx \qquad ge\ id_f = \mathsf{Some}\ b_f \\
\mathsf{initial\_core}\ (\mathit{modules}\ idx)\ (\mathsf{Vptr}\ (b_f, 0))\ \vec{v} = \mathsf{Some}\ c' \\
\mathit{moduleStates}\ idx = \omega'
\end{array}}
     {ge \vdash l, m \Longmapsto l\ \mathsf{with}\ \{\mathsf{stack} := \mathsf{push}\ (c', \omega')\ l.\mathsf{stack}\},\ m}
\quad \text{(Call}'\text{)}
$$

to pair the module-local state $\omega'$ (as stored in moduleStates the previous
time module idx returned to its caller) with the new core state c' initialized
to handle function $id_f$. The map modules : I_N → Modsem would also have
to be updated, to contain interaction semantics that operate on pairs of
core states and module-local states. Returning to a caller would involve an
update to the moduleStates map, at the caller's module index.


Chapter 


5  - 

Compiler  Correctness 


Interaction and linking semantics (Chapters 3 and 4) provide the machinery
with which to state compiler correctness (as cross-language contextual
equivalence; cf. Section 4.2 for the basic definition). We do not yet have a
way to establish such equivalences, however.

This chapter lays the groundwork. First, I present whole-program simulations,
which recapitulate standard forward simulation proofs for closed programs,
but adapted to the interaction semantics of Chapter 3. In Chapter 6
I will establish whole-program simulation for open programs by linking
with a closing context, depending on the results here.

The second half of this chapter introduces structured simulations, a new
equivalence method for open programs. Structured simulations are the
compiler correctness relations we establish for each phase in Compositional
CompCert (Chapter 8). As the results of the next chapter demonstrate, if
each pair of modules $S_i$ and $T_i$ in a multimodule program is related by
structured simulation, then the overall linked source and target programs
$\mathcal{L}(S)$ and $\mathcal{L}(T)$ are contextually equivalent (Theorem 7 of Chapter 6).

In contrast with standard forward simulations and the logical simulation
relations of previous work [BSDA14], the two distinguishing characteristics
of structured simulations are their rely-guarantee and ownership
disciplines.

Rely-Guarantee: Structured simulations impose a rely-guarantee discipline
on the interactions of program modules. The rely-guarantee
discipline ensures that module compilation preserves the same properties
that modules themselves assume about the behavior of external
functions (those defined in other modules). This, in turn, makes it
possible to implement external functions or libraries with code that is
itself compiled.

Ownership: Structured simulations enrich CompCert's standard simulation
relations with additional "ownership" data, which makes it possible
to distinguish memory regions that are reorganized during compilation
of distinct translation units. For example, the portion of the
stack frame reserved for spilling during compilation of a function
A.f can be distinguished from the spill region reserved for a second
function B.g, defined in a distinct translation unit B.

A key insight of Ownership above is that the invariants that apply to
distinct regions of memory (such as the regions reserved by the compiler for
A.f's and B.g's spilled locals) are subjective: function A.f can write to its
own spilled locals but not to B.g's, and vice versa for B.g with respect to
A.f's spills. Structured simulations deal with this subjectivity by imposing
an "us vs. them" discipline on compiler correctness invariants: Each
structured simulation distinguishes the parts of the state that it controls (the
"us") from the parts of the state controlled by the environment (the "them").
This discipline is reminiscent of Ley-Wild and Nanevski's subjective
concurrent separation logic [LWN13], though here it is applied to the two-program
invariants used to prove compiler correctness.

Another ingredient is a "leakage" protocol, which ensures that the views
of the memory state imposed by the compiler invariants for different modules
remain consistent. For example, when A.f calls B.g with arguments
$\vec{v}$, A.f's compilation invariant must "give up exclusive control" of all the
memory regions reachable from $\vec{v}$ (i.e., following pointer chains rooted in $\vec{v}$).
This condition represents the guarantee that, while later compilation stages
of A.f can still reorganize parts of the state reachable from $\vec{v}$ (e.g., by
changing the order in which memory regions are allocated), they cannot remove
these memory regions entirely (e.g., by dead code/memory analysis): the
existence of the memory regions in question has been leaked irrevocably
to the environment. Similarly, at external function return points, memory
regions reachable from the return value are "leaked in" to the caller's
compilation invariant, representing the rely that these regions will never later
be removed by compilation of the environment. Our language-independent
linking semantics and contextual equivalence proof ensure that these
conditions are in rely-guarantee relation.

Interestingly, this leakage protocol bears much in common with the
system-level semantics of Ghica and Tzevelekos [GT12]. There, Ghica and
Tzevelekos define a game semantics for a C-like language that avoids
imposing so-called combinatorial (i.e., syntactic) restrictions on the moves of
the environment, by applying what they call "epistemic" restrictions
instead. These epistemic conditions, which parallel our leakage conditions,
allow the environment to update the state in nearly any way as long as
the updates are to memory regions leaked to the environment during
previous interactions with the client program. This leads to a strong semantic
notion of program context similar to the one I develop in Section 4.2. While
Ghica and Tzevelekos were interested in modeling open C-like programs
and their environments, not compiler correctness in this setting, I view the
coincidence of our leakage conditions with their system-level semantics as
evidence of the naturalness of our leakage protocol (Section 5.3.2).


5.1  Whole-Program  Simulations 

Simulation relations (or just simulations), and the related notion of
bisimulations, were first used to prove program equivalence by Milner in the early
1970s (cf. [Mil71]). The idea is to define a relation R on the states of two
infinite systems S and T (e.g., two potentially nonterminating programs)
such that R(s, t) implies:

• steps of the first system $s \longmapsto s'$ are matched by steps of the second $t \longmapsto t'$; and

• the relation R(s', t') can be reestablished after each such pair of steps.

The first system S is generally called the source system in this thesis, while
the second system T is the target, by analogy with the source and target
languages of a compiler.

There are many variations on the basic idea. A bisimulation is a simulation
R such that $R^{-1}$ is also a simulation. A weak simulation (or bisimulation)
is one in which the number of steps taken by the two systems is not
one-to-one: for example, R(s, t) and $s \longmapsto s'$ may imply only that $t \longmapsto^{+} t'$
such that R(s', t'), i.e., t takes one or more steps to t' in order to reestablish
the relation.

A further useful generalization is stuttering simulation, in which multiple
steps $s \longmapsto s'$ in the first system correspond to just a single step in the target
system. Stuttering is typically modeled by defining a well-founded order <
on states s, s' (i.e., such that there are no infinite descending chains s > s' >
s'' > ...) for which s' < s holds at each stuttering step. The well-founded
order precludes infinite source sequences $s \longmapsto s' \longmapsto s'' \longmapsto \cdots$ that do not
cause the target to make at least some progress. This is useful for proving,
e.g., preservation of termination behavior from S to T. In Compositional
CompCert, and in the rest of this thesis, I will generally employ simulations
of the stuttering kind, since they present a nice balance between expressivity
(many kinds of compiler transformations can be proved correct in this way)
and simplicity.
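
For finite transition systems, the stuttering condition can be checked mechanically, which may help make the role of the well-founded order concrete. The Python sketch below is an assumption-laden toy, not part of the thesis development: systems are dictionaries from states to successor sets, the relation R is a set of state pairs, and the well-founded order is induced by a natural-number measure on source states.

```python
def reach_plus(step, t):
    """States reachable from t in one or more steps."""
    seen, frontier = set(), list(step.get(t, ()))
    while frontier:
        u = frontier.pop()
        if u not in seen:
            seen.add(u)
            frontier.extend(step.get(u, ()))
    return seen

def is_stuttering_simulation(src, tgt, R, measure):
    """Every source step from a related pair is matched by one or more target
    steps, or by zero or more target steps while the source measure decreases."""
    for (s, t) in R:
        plus = reach_plus(tgt, t)
        star = plus | {t}
        for s2 in src.get(s, ()):
            progress = any((s2, t2) in R for t2 in plus)
            stutter = measure(s2) < measure(s) and any((s2, t2) in R for t2 in star)
            if not (progress or stutter):
                return False
    return True

# The source takes two steps where the target takes one.
src = {"s0": {"s1"}, "s1": {"s2"}}
tgt = {"t0": {"t2"}}
rel = {("s0", "t0"), ("s1", "t0"), ("s2", "t2")}
measure = {"s0": 1, "s1": 0, "s2": 0}.get

# A source that loops forever while the target stands still.
loop_src = {"s0": {"s0"}}
loop_rel = {("s0", "t0")}
```

The first example is accepted: the step from s0 to s1 is a stuttering step justified by the measure dropping from 1 to 0. The looping example is rejected for any measure, since the measure cannot strictly decrease on a self-loop, which is the finite analogue of ruling out infinite source sequences with no target progress.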

Figure 5.1 presents the clauses of what I call whole-program simulations,¹
adapted to the interaction semantics interface of Chapter 3. The simulations
are "whole program" because they do not (yet) relate program modules
that make external function calls (there are no cases for at_external and
after_external). These will be dealt with in Section 5.3.2.

Whole-program simulations are nevertheless useful. For example, as we
will see in Chapter 6, they can be used to relate the behaviors of source and
target programs linked with a closing context, leading to a proof method
for contextual equivalence. They are also simpler than the structured
simulations I will present next, in Section 5.3.2, in the sense that it is easier
to prove corollaries of whole-program simulation such as termination and
safety preservation (Section 5.2).

There are three main clauses in Figure 5.1. The first, Initial Core, relates
programs at initialization. The second, Core Step, relates programs over core
steps. The third, Halted Core, relates programs at program exit. The main
datatypes are:

• the source/target interaction semantics S and T;

• the source/target global environments $ge_S$ and $ge_T$;

• function pointers, arguments, and return values v, $\vec{v}_1$, $\vec{v}_2$, $v_1$, and $v_2$;

• core states c, c', d, and d' of S and T respectively;

• CompCert memory injections f and f'; and

• a matching relation $(c, m) \sim_f (d, tm)$ on source and target core states
and memories, indexed by the memory injection f (the R relation of
the beginning of Section 5.1).

The Initial Core clause says: if initialization of source semantics S succeeds
when passed function pointer v, arguments $\vec{v}_1$, in global environment
$ge_S$, to produce a new core state c of the S semantics, then initializing
T to execute the same function v, with related arguments $\vec{v}_2$ in related
global environment $ge_T$, results in a state d such that $(c, m) \sim_f (d, tm)$.

The arguments $\vec{v}_1$, $\vec{v}_2$ and the global environments $ge_S$, $ge_T$ are related
by the following auxiliary relations that parameterize every whole-program
simulation structure:

globals_inv $ge_S$ $ge_T$, which relates the global environments $ge_S$ and
$ge_T$. This is typically defined as dom $ge_S$ = dom $ge_T$; and

init_inv f $ge_S$ $\vec{v}_1$ m $ge_T$ $\vec{v}_2$ tm, which specifies conditions that hold of
the initial arguments and memories to a pair of programs.


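¹ File compcomp/core/closed_simulations.v.

Initial Core

$$
\begin{array}{l}
(1)\ \mathsf{globals\_inv}\ ge_S\ ge_T\ \land \\
(2)\ \mathsf{initial\_core}\ S\ ge_S\ v\ \vec{v}_1 = \mathsf{Some}\ c\ \land \\
(3)\ \mathsf{init\_inv}\ f\ ge_S\ \vec{v}_1\ m\ ge_T\ \vec{v}_2\ tm \\
\implies \exists d.\ (4)\ \mathsf{initial\_core}\ T\ ge_T\ v\ \vec{v}_2 = \mathsf{Some}\ d\ \land \\
\qquad\qquad (5)\ (c, m) \sim_f (d, tm)
\end{array}
$$

Core Step

$$
\begin{array}{l}
(1)\ \mathsf{globals\_inv}\ ge_S\ ge_T\ \land \\
(2)\ (c, m) \sim_f (d, tm)\ \land \\
(3)\ ge_S \vdash c, m \longmapsto c', m' \\
\implies \exists d'\ tm'\ f'. \\
\qquad (4)\ (c', m') \sim_{f'} (d', tm') \\
\qquad \land\ \big(\, (5)\ ge_T \vdash d, tm \longmapsto^{+} d', tm'\ \lor \\
\qquad\qquad (6)\ (ge_T \vdash d, tm \longmapsto^{*} d', tm' \land c' < c) \,\big)
\end{array}
$$

Halted Core

$$
\begin{array}{l}
(1)\ \mathsf{globals\_inv}\ ge_S\ ge_T\ \land \\
(2)\ (c, m) \sim_f (d, tm)\ \land \\
(3)\ \mathsf{halted}\ S\ c = \mathsf{Some}\ v_1 \\
\implies \exists v_2.\ (4)\ \mathsf{halted}\ T\ d = \mathsf{Some}\ v_2\ \land \\
\qquad\qquad (5)\ \mathsf{halt\_inv}\ f\ ge_S\ v_1\ m\ ge_T\ v_2\ tm
\end{array}
$$

Figure 5.1: Whole-program simulations S < T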


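In Compositional CompCert, we specialize the $\mathsf{init\_inv}$ relation to:

\[
\begin{array}{l}
\mathsf{init\_inv}\ f\ ge_S\ v_1\ m\ ge_T\ v_2\ tm = \\
\quad \mathsf{inject}\ f\ m\ tm \wedge \mathsf{vals\_inject}\ f\ v_1\ v_2 \wedge \mathsf{preserves\_globals}\ ge_S\ f\ \wedge \\
\quad \mathsf{mem\_valid}\ tm \wedge \mathsf{globals\_valid}\ ge_T\ tm \wedge \mathsf{vals\_valid}\ v_2\ tm
\end{array}
\]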

The inject conditions state that the memories $m$, $tm$ and initial arguments
$v_1$ and $v_2$ are related by the CompCert injection $f$, as defined
in Chapter 2. The $\mathsf{preserves\_globals}$ clause states that $f$ at least maps
the blocks in $\mathrm{dom}\ ge_S$, and is the identity mapping in this range (i.e.,
global blocks are not removed by $f$, or translated to new blocks). The
last three conditions (starting with $\mathsf{mem\_valid}\ tm$) state that

•  the target memory $tm$ does not contain pointers to invalid blocks (recall
that, as in Chapter 2, an invalid block is one that has not yet
been allocated),

•  the target global blocks are all valid ($\mathsf{globals\_valid}\ ge_T\ tm$), and

•  the initial target arguments do not contain pointers to invalid blocks
either ($\mathsf{vals\_valid}\ v_2\ tm$).

These conditions are easily satisfied when a target program is initialized
in a memory containing, e.g., just the globals for the program,
with arguments that are either constants (such as integers or floats) or
pointers to global variables or functions. Indeed, there is no other
static data to point to at program startup (I am not yet considering
the core initializations that occur, e.g., in linking semantics, at
intermodule external calls).
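As an informal illustration (not part of the Coq development), the three validity conditions can be sketched in Python. The memory representation here, a dictionary from allocated block numbers to lists of cell values, and all function bodies are assumptions made only for this sketch; they do not reproduce the CompCert memory model.

```python
from typing import Dict, List, Tuple, Union

Block = int
Pointer = Tuple[Block, int]            # (block, offset); hypothetical encoding
Value = Union[int, float, Pointer]     # constants or pointers
Memory = Dict[Block, List[Value]]      # keys are the allocated (valid) blocks


def is_pointer(v: Value) -> bool:
    return isinstance(v, tuple)


def vals_valid(vals: List[Value], tm: Memory) -> bool:
    """No value in vals is a pointer into an unallocated block of tm."""
    return all(v[0] in tm for v in vals if is_pointer(v))


def mem_valid(tm: Memory) -> bool:
    """No cell of tm holds a pointer into an unallocated block."""
    return all(vals_valid(cells, tm) for cells in tm.values())


def globals_valid(global_blocks: List[Block], tm: Memory) -> bool:
    """Every global block is allocated in tm."""
    return all(b in tm for b in global_blocks)


# Startup memory holding only globals: block 1 stores an integer,
# block 2 stores a pointer to block 1.
tm = {1: [42], 2: [(1, 0)]}
args_ok = [7, 3.5, (2, 0)]   # constants and a pointer to a global
args_bad = [(9, 0)]          # pointer to a block that was never allocated
```

With this startup memory, `mem_valid(tm)`, `globals_valid([1, 2], tm)`, and `vals_valid(args_ok, tm)` all hold, whereas `vals_valid(args_bad, tm)` fails because block 9 has not been allocated.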

The Core Step clause is a bit more involved: Assume

1.  $\mathsf{globals\_inv}$ holds of $ge_S$ and $ge_T$,

2.  $(c,m)$ and $(d,tm)$ are matching source and target configurations
indexed by injection $f$, and

3.  $(c,m)$ steps to $(c',m')$.

Then there exist $d'$, $tm'$, and $f'$ such that

4.  $(d',tm')$ matches $(c',m')$ (under $f'$), and either

5.  $(d,tm)$ takes one or more steps to $(d',tm')$, or

6.  $(d,tm)$ takes zero or more steps to $(d',tm')$ and $c'$ descends
the stuttering order ($c' < c$). (Alternatively, one could say that
$d = d' \wedge tm = tm'$ in this case.)

The final clause, Halted Core, defines what it means for the halted state to
be preserved: Assume

1.  $\mathsf{globals\_inv}$ holds of $ge_S$ and $ge_T$,

2.  $(c,m)$ and $(d,tm)$ are matching configurations, and

3.  $c$ is halted with return value $v_1$.

Then there exists a $v_2$ such that $d$ is also halted, with return value
$v_2$, and $v_1$ and $v_2$ (along with $ge_S$, $ge_T$, $m$, and $tm$) satisfy $\mathsf{halt\_inv}$, a
predicate parameter chosen by the user who proved the simulation.
In Compositional CompCert, we specialize this relation to:

\[
\begin{array}{l}
\mathsf{halt\_inv}\ f\ ge_S\ v_1\ m\ ge_T\ v_2\ tm = \\
\quad \mathsf{inject}\ f\ m\ tm \wedge \mathsf{val\_inject}\ f\ v_1\ v_2 \wedge \mathsf{preserves\_globals}\ ge_S\ f
\end{array}
\]

The memories $m$ and $tm$, as well as the return values $v_1$ and $v_2$, are
injected by $f$. In addition, $f$ is a superset of the identity relation on
$\mathrm{dom}\ ge_S$ ($\mathsf{preserves\_globals}\ ge_S\ f$).


5.2  Corollaries 

There is nothing overly novel in the results presented in the previous section,
beyond the adaptation of standard forward simulations to the interaction
semantics interface of Chapter 3. Why focus on simulations for whole
programs at all then?

The results of Chapter 6 will establish soundness for open programs
$P$ by linking with closing contexts $C$ (those that do not themselves call
external functions; they may call back into $P$). Section 6.2 constructs, from
the open-program simulations on source $P_S$ and target $P_T$ to be presented
in Section 5.3.2, a whole-program simulation between $\mathcal{L}(C, [P_S])$ and
$\mathcal{L}(C, [P_T])$ (source and target open programs linked with closing context
$C$). Preservation of behaviors in context $C$ then follows (Theorem 7) from
corollaries of whole-program simulation I present below.

5.2.1  Termination 

We say that a program configuration $(c,m)$ terminates in global environment
$ge$ if it steps, in zero or more steps, to a configuration $(c',m')$ for which $c'$ is
halted.


Definition 5 ($\mathsf{terminates}\ ge\ (c,m)$).

\[
\exists c'\ m'.\ ge \vdash (c,m) \longmapsto^{*} (c',m')\ \wedge\ \exists v.\ \mathsf{halted}\ c' = \mathsf{Some}\ v
\]
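Definition 5 quantifies over runs of any length, so it is not directly executable. The following Python sketch, assuming a deterministic step function and an explicit fuel bound (both assumptions of the sketch, as are the toy `step` and `halted` functions), searches for a halted configuration and so under-approximates the predicate.

```python
def terminates_within(step, halted, c, m, fuel):
    """Search for a halted configuration reachable in at most `fuel` steps.

    Returns (steps_taken, return_value) on success, or None if the run
    gets stuck or the fuel bound is exhausted first.
    """
    for n in range(fuel + 1):
        v = halted(c)
        if v is not None:
            return (n, v)
        nxt = step(c, m)
        if nxt is None:        # stuck: not halted, but no step is possible
            return None
        c, m = nxt
    return None


# Toy semantics: the core state c counts down; memory m counts steps taken.
def step(c, m):
    return (c - 1, m + 1) if c > 0 else None


def halted(c):
    return "done" if c == 0 else None
```

Starting from `c = 3`, the search succeeds after three steps when given enough fuel and reports failure when the fuel is too small. A `None` result therefore does not establish non-termination.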


It is not hard to show that whole-program simulation $S < T$ implies
preservation of termination from source program $S$ to target $T$.2 Recall that
$S < T$ defines a matching relation $(c,m) \sim_f (d,tm)$, subject to the laws in
Figure 5.1.

Corollary 1 (Termination Preservation). Assume $S < T$. For source configurations
$(c,m)$ and target configurations $(d,tm)$, if $(c,m) \sim_f (d,tm)$ and
$\mathsf{terminates}\ ge_S\ (c,m)$, then $\mathsf{terminates}\ ge_T\ (d,tm)$.


2 The definitions, theorems, and proofs in this subsection on termination can be
found in file compcomp/core/closed_simulations_lemmas.v.

Proof. Assumption $\mathsf{terminates}\ ge_S\ (c,m)$ unfolds to:

\[
\exists c'\ m'.\ ge_S \vdash (c,m) \longmapsto^{*} (c',m')\ \wedge\ \exists v.\ \mathsf{halted}\ c' = \mathsf{Some}\ v.
\]

The corollary follows by induction on the multistep relation $ge_S \vdash (c,m) \longmapsto^{*}
(c',m')$, using the Core Step and Halted Core cases of the simulation $S < T$.

□

The other direction of this corollary, reflection of termination behavior
from $T$ to $S$, holds only under additional assumptions. In particular, the
target language $L_T$ must be deterministic and source configuration $(c,m)$
must be safe.

Corollary 2 (Termination Reflection). For source configurations $(c,m)$ and
target configurations $(d,tm)$, if $(c,m) \sim_f (d,tm)$, $\mathsf{terminates}\ ge_T\ (d,tm)$,
configuration $(c,m)$ is safe in $ge_S$, and language $L_T$ is deterministic, then
$\mathsf{terminates}\ ge_S\ (c,m)$.

Proof. Assumption $\mathsf{terminates}\ ge_T\ (d,tm)$ unfolds to:

\[
\exists d'\ tm'\ n.\ ge_T \vdash (d,tm) \longmapsto^{n} (d',tm')\ \wedge\ \exists v.\ \mathsf{halted}\ d' = \mathsf{Some}\ v.
\]

The corollary follows by well-founded induction on $n$, using the usual less-than
relation $<$ on the naturals, and relying on Lemmas 1 and 2 below. □

Lemma 1 (Split Multistep). If

•  $L_T$ is deterministic,

•  $ge_T \vdash (d,tm) \longmapsto^{n} (d',tm')$,

•  $ge_T \vdash (d,tm) \longmapsto^{m} (d'',tm'')$, and

•  $n < m$

then there exists $q$ such that

•  $m = n + q$, and

•  $ge_T \vdash (d',tm') \longmapsto^{q} (d'',tm'')$.
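The content of Lemma 1 can be checked on a concrete deterministic machine. The Python sketch below uses a toy step function invented for this illustration; determinism is modeled by `step` being an ordinary function, so each configuration has exactly one successor.

```python
def run(step, config, n):
    """Apply the deterministic step function exactly n times."""
    for _ in range(n):
        config = step(config)
    return config


# Toy target semantics: d is a program counter, tm an accumulator.
def step(config):
    d, tm = config
    return (d + 1, tm * 2)


start = (0, 1)
n, m = 3, 7
q = m - n                      # the witness q with m = n + q
mid = run(step, start, n)      # (d', tm')   after n steps
end = run(step, start, m)      # (d'', tm'') after m steps
```

Here `run(step, mid, q) == end`: the `m`-step run factors through the `n`-step run. For a nondeterministic step relation the two runs could diverge after the first step, which is why the lemma assumes determinism.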

Lemma 2 (Match Cases). A state $c$ is halted if there exists a return value $v$ such
that $\mathsf{halted}\ c = \mathsf{Some}\ v$. If $(c,m) \sim_f (d,tm)$, then either

•  $\mathsf{halted}\ c \wedge \mathsf{halted}\ d$; or

•  $\exists f'\ c'\ m'.\ ge_S \vdash (c,m) \longmapsto (c',m')$ and either

   -  $(c',m') \sim_{f'} (d,tm) \wedge \mathsf{halted}\ c' \wedge \mathsf{halted}\ d$; or

   -  $\exists d'\ tm'.\ ge_T \vdash (d,tm) \longmapsto^{+} (d',tm') \wedge (c',m') \sim_{f'} (d',tm')$.

Lemma 1 asserts that, for deterministic languages, if we step in $n$ steps
for some $n$ from $(d,tm)$ to $(d',tm')$:

\[
(d,tm) \longmapsto^{n} (d',tm')
\]

and we also step, in $m > n$ steps, from $(d,tm)$ to $(d'',tm'')$:

\[
(d,tm) \longmapsto^{m} (d'',tm'')
\]

then the second multistep relation, from $(d,tm)$ to $(d'',tm'')$, can be decomposed
into two multisteps intersecting at $(d',tm')$:

\[
(d,tm) \longmapsto^{n} (d',tm') \longmapsto^{q} (d'',tm'')
\]

Lemma 2 is a useful elimination principle for $(c,m) \sim_f (d,tm)$ that
facilitates reasoning by cases.

Putting  everything  together,  we  get  that  S  <  T  implies  equitermination 
of  matching  states,  under  the  additional  assumptions  required  to  prove 
Corollary  2. 

Corollary 3 (Equitermination). For source configurations $(c,m)$ and target
configurations $(d,tm)$, if $(c,m) \sim_f (d,tm)$, configuration $(c,m)$ is safe in $ge_S$,
and language $L_T$ is deterministic, then
$\mathsf{terminates}\ ge_S\ (c,m) \iff \mathsf{terminates}\ ge_T\ (d,tm)$.

Proof. By Corollaries 1 and 2. □


5.2.2  Safety 


Simulations $S < T$ also imply safety preservation from source to target.
When I say a configuration $(c,m)$ of interaction semantics $S$ is safe, in global
environment $ge$, I mean, as usual, that the configuration will never get stuck
(it may safely halt or loop forever). In the context of whole-program simulations,
the notion of safety we care about is that of closed programs (those
that do not call external functions). I generalize safety to open programs in
Chapter 6.

We say a configuration $(c,m)$ is safe in global environment $ge$ for $n$ steps
if it satisfies the following recursive predicate, expressed in Coq notation:

    safeN n ge c m : Prop =
      case n of
      | 0 → True
      | n' + 1 →
          case halted c of
          | None → ∃ c' m'. ge ⊢ c,m ↦ c',m' ∧ safeN n' ge c' m'
          | Some v → True

When $n$ is 0, the predicate evaluates to True (we have given up interrogating
the system). When $n$ is greater than 0, there are two cases. If $c$ is not halted
(None), we assert that there exist a core state $c'$ and memory $m'$ such that
(a) the system steps from $(c,m)$ to $(c',m')$ (we make progress) and (b)
the new state $(c',m')$ is still safe, for at least $n - 1$ steps. When $c$ is halted
(Some $v$), safeN reduces to True.
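For a deterministic language the existential in safeN has at most one witness, so the predicate can be run directly. The following Python sketch is an executable reading of safeN under that assumption; the instruction-list machine (`PROG_OK`, `PROG_BAD`, `make_semantics`) is a toy invented for the illustration.

```python
def safe_n(n, step, halted, c, m):
    """Executable reading of safeN for a deterministic step function.

    step(c, m) returns the unique successor (c', m'), or None when stuck;
    halted(c) returns a return value, or None when c is not halted.
    """
    if n == 0:
        return True                      # gave up interrogating the system
    if halted(c) is not None:
        return True                      # halted states are safe
    nxt = step(c, m)
    if nxt is None:
        return False                     # stuck: neither halted nor able to step
    c2, m2 = nxt
    return safe_n(n - 1, step, halted, c2, m2)


PROG_OK = ["inc", "inc", "halt"]
PROG_BAD = ["inc", "crash", "halt"]


def make_semantics(prog):
    """Build (step, halted) for a program; the core state is a program counter."""
    def halted(c):
        return 0 if prog[c] == "halt" else None

    def step(c, m):
        if prog[c] == "inc":
            return (c + 1, m + 1)
        return None                      # "crash" has no successor

    return step, halted


step_ok, halted_ok = make_semantics(PROG_OK)
step_bad, halted_bad = make_semantics(PROG_BAD)
```

`PROG_BAD` is safe for one step but not for two: the approximation at index 1 cannot yet see the stuck state, which illustrates why safety proper quantifies over all indices.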

Definition 6 (Safety). A configuration $(c,m)$ is safe in global environment $ge$
if it is safeN for all $n$.

\[
\mathsf{safe}\ ge\ c\ m = \forall n.\ \mathsf{safeN}\ n\ ge\ c\ m
\]

If one views safeN as a finite approximation of safety (i.e., for some
number $n$ of steps), then safe is the intersection of all such approximations.
This style of definition is quite useful when doing proofs by induction (on
$n$), especially with respect to step-indexed semantics. Another, equivalent,
definition is the more standard: a configuration $(c,m)$ is safe if any configuration
$(c',m')$ it can multistep to, $ge \vdash (c,m) \longmapsto^{*} (c',m')$, is either halted
or can take at least one step.

Now  we  can  state  what  it  means  for  safety  to  be  preserved  by  S  <  T: 

Corollary 4 (Safety Preservation3). If

•  $(c,m) \sim_f (d,tm)$,

•  $\mathsf{safe}\ ge_S\ c\ m$, and

•  $L_S$ and $L_T$ are deterministic

then $\mathsf{safe}\ ge_T\ d\ tm$.

Proof. $\mathsf{safe}\ ge_T\ d\ tm$ unfolds to: $\forall n.\ \mathsf{safeN}\ n\ ge_T\ d\ tm$. The corollary follows
by induction on $n$, relying on Lemmas 2, 3, 4, and 5. □

Lemma 3 (Safe Downward). $\forall ge\ n\ n'\ c\ m.\ n' < n \wedge \mathsf{safeN}\ n\ ge\ c\ m \Longrightarrow
\mathsf{safeN}\ n'\ ge\ c\ m$.

Lemma 4 (Safe Forward). $\forall ge\ n\ n'\ c\ m.\ \mathsf{deterministic}\ L_S \wedge ge \vdash (c,m) \longmapsto^{n}
(c',m') \wedge \mathsf{safeN}\ (n + n')\ ge\ c\ m \Longrightarrow \mathsf{safeN}\ n'\ ge\ c'\ m'$.

Lemma 5 (Safe Backward). $\forall ge\ n\ n'\ c\ m.\ ge \vdash (c,m) \longmapsto^{n} (c',m') \wedge
\mathsf{safeN}\ (n' - n)\ ge\ c'\ m' \Longrightarrow \mathsf{safeN}\ n'\ ge\ c\ m$.

Lemma 3 proves safeN is closed under approximation. Lemma 4 states
that, for deterministic languages $L_S$, multistepping from a safe state results
in a safe state. Lemma 5 is the backward analog of Lemma 4: multistepping
from $(c,m)$ to a safe state $(c',m')$ implies that $(c,m)$ is also safeN.

3 The Coq proof is in file compcomp/core/closed_simulations_lemmas.v.

To prove the termination results above (in particular, termination reflection,
Corollary 2), we needed only that $L_T$ was deterministic. Why do we need
determinism of $L_S$ here, in order to prove
safety preservation? The answer has to do with how the definition of safety
is formulated in Definition 6 and the auxiliary safeN. There, in safeN, we
state only that there exist a $c'$ and $m'$ such that $ge \vdash c,m \longmapsto c',m'$ and
$\mathsf{safeN}\ n'\ ge\ c'\ m'$. This is sufficient for deterministic languages (since there
can be only one such pair $(c',m')$) but it is not quite strong enough to
capture safety in nondeterministic languages, for which we would like to
know instead that (a) there exist such a $c'$ and $m'$, but also that (b) for all
such $c'$ and $m'$ (i.e., for which $ge \vdash c,m \longmapsto c',m'$), $\mathsf{safeN}\ n'\ ge\ c'\ m'$, along
the lines of:

    safeN' n ge c m : Prop =
      case n of
      | 0 → True
      | n' + 1 →
          case halted c of
          | None → (∃ c' m'. ge ⊢ c,m ↦ c',m') ∧
                   (∀ c' m'. ge ⊢ c,m ↦ c',m' ⟹ safeN' n' ge c' m')
          | Some v → True

    safe' ge c m = ∀ n. safeN' n ge c m

Under  this  second  formulation  of  safety,  we  have  that  Corollary  4  holds 
even  if  S  is  nondeterministic.  In  addition,  we  can  prove  that — assuming  S 
is  deterministic — the  first  formulation  of  safety  implies  the  second. 

Lemma 6. Assume $L_S$ is deterministic. For all $ge$, $c$, $m$, $\mathsf{safe}\ ge\ c\ m \Longrightarrow
\mathsf{safe'}\ ge\ c\ m$.
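The gap between the two formulations shows up as soon as a state has both a good and a stuck successor. The Python sketch below models a nondeterministic semantics by a successor function returning a list of next states; the two tiny transition graphs and the boolean `halted` test are assumptions of the illustration.

```python
def safe_n_exists(n, succs, halted, s):
    """First formulation (safeN): some successor exists that is safe."""
    if n == 0 or halted(s):
        return True
    return any(safe_n_exists(n - 1, succs, halted, t) for t in succs(s))


def safe_n_forall(n, succs, halted, s):
    """Second formulation (safeN'): a successor exists and all successors are safe."""
    if n == 0 or halted(s):
        return True
    nexts = succs(s)
    return bool(nexts) and all(safe_n_forall(n - 1, succs, halted, t) for t in nexts)


# Nondeterministic graph: "start" may step to a halted state or to a stuck one.
NONDET = {"start": ["good", "stuck"], "good": [], "stuck": []}
# Deterministic graph: every state has at most one successor.
DET = {"a": ["b"], "b": []}


def halted_nondet(s):
    return s == "good"


def halted_det(s):
    return s == "b"
```

On `NONDET`, the first formulation accepts `"start"` at index 2 (it picks the good successor), while the second rejects it because the stuck successor is also reachable. On `DET` the two agree, in line with Lemma 6.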

We use the first formulation, as given in Definition 6, because it matches
the definition of safety used in the Verifiable C program logic [ADH+14].
This definition is sufficient in Compositional CompCert because all of the
languages of the compiler, from Clight to CompCert x86 assembly, are deterministic.
On the other hand, it could be useful in the future to generalize
the definition of safeN used in the Verifiable C logic for nondeterminism.

5.2.3  Behavior  Refinement 

There is a third corollary of $S < T$, tying together both termination and
safety preservation. Define the behavior of a configuration by the following
inductive type:

    Inductive behavior : Type := Termination | Divergence | Going_wrong.

A (closed) program (i.e., one that does not call any external functions) either
terminates, diverges, or goes wrong (gets stuck).

Behavior refinement says that if target configuration $(d,tm)$ exhibits
some behavior $tb$, then matching source configurations $(c,m)$ will exhibit
behaviors $b$ that are refined by $tb$ ($b \geq_{\mathsf{beh}} tb$).

Corollary 5 (Behavior Refinement). If

•  $(c,m) \sim_f (d,tm)$,

•  $(d,tm)$ has behavior $tb$ in environment $ge_T$, and

•  $L_T$ is deterministic

then there exists behavior $b$ such that

•  $(c,m)$ exhibits behavior $b$ in environment $ge_S$, and

•  $b \geq_{\mathsf{beh}} tb$.

Proof. Proved in Coq.4 □

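Refinement of behaviors $b \geq_{\mathsf{beh}} tb$ is defined as in the following table:

    Source behavior...                       is refined by target behavior...
    -----------------------------------------------------------------------------
    Termination     $\geq_{\mathsf{beh}}$    Termination
    Divergence      $\geq_{\mathsf{beh}}$    Divergence
    Going_wrong     $\geq_{\mathsf{beh}}$    Termination, Divergence, Going_wrong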

If  the  source  configuration  terminates  or  diverges,  then  so  must  the  target 
configuration.  If  the  source  program  goes  wrong  (gets  stuck),  then  the 
target  may  either  terminate,  diverge,  or  itself  go  wrong. 
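The refinement table is small enough to transcribe directly. The following Python sketch encodes the three behaviors and the relation "source behavior `b` is refined by target behavior `tb`"; the names `Behavior` and `refined_by` are chosen for this illustration only.

```python
from enum import Enum


class Behavior(Enum):
    TERMINATION = "Termination"
    DIVERGENCE = "Divergence"
    GOING_WRONG = "Going_wrong"


def refined_by(b: Behavior, tb: Behavior) -> bool:
    """True when source behavior b is refined by target behavior tb."""
    if b is Behavior.GOING_WRONG:
        return True          # a source that goes wrong permits any target behavior
    return b is tb           # otherwise the target must behave identically
```

A terminating source is refined only by a terminating target, whereas `refined_by(Behavior.GOING_WRONG, tb)` holds for every `tb`.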

The relation that ascribes behaviors to programs is given by:

    In env. ge, (c,m) has behavior...    iff...
    ------------------------------------------------------------------------
    Termination                          terminates ge (c,m)
    Divergence                           forever_steps_or_halted ge (c,m)
                                         ∧ ¬terminates ge (c,m)
    Going_wrong                          ¬safe ge (c,m)

To handle nondeterministic languages, replace safe above with the alternate
definition safe'. For deterministic languages, forever_steps_or_halted is
equivalent to safe.

If  we  know,  in  addition,  that  the  source  configuration  (c,  m)  is  safe,  then 
we  get  an  even  stronger  result,  namely: 


4 File compcomp/core/closed_simulations_lemmas.v.

Corollary 6 (Behavioral Equivalence). If

•  $(c,m) \sim_f (d,tm)$,

•  $L_T$ is deterministic, and

•  $(c,m)$ is safe in $ge_S$

then for all behaviors $b$, $(c,m)$ has behavior $b$ in environment $ge_S$ iff $(d,tm)$ has
behavior $b$ in environment $ge_T$.

Proof. ($\Longrightarrow$) By Corollaries 3 and 4. ($\Longleftarrow$) By Corollary 5. □


5.3  Open  Program  Simulations 

In this section, I extend the whole-program simulations $S < T$ of Section 5.1
to open programs (i.e., those that may call external functions defined in other
translation units), in the form of structured simulations $S \preccurlyeq T$. Structured
simulations are an extension of the related logical simulation relations (LSRs),
which were first described in [BSDA14]. I first give background on LSRs, as
motivation, then present structured simulations.


5.3.1  Logical  Simulation  Relations 

Logical simulation relations (LSRs) established compiler correctness by
showing that compilation preserved the protocol structure of the interaction
semantics of Chapter 3. They used CompCert's original match relations,
with memory injections $f$, to relate source and target states.

What does "preserving the protocol structure of interaction semantics"
mean? For internal execution steps, LSRs followed CompCert's standard
forward simulation proofs: internal steps of the source semantics were
matched by (one or more) internal steps of the target semantics, up to stuttering
of the source. For external calls, LSRs departed from CompCert by
asserting that related modules:

•  called the same function with related arguments; and

•  were receptive, at the point at which external function calls returned,
to any related values and memories the environment might provide.
By receptive, I mean the equivalence of related modules could be
re-established at the point of external function call return assuming
related return values and memories.

This last condition was subject to a few constraints on how memory
could evolve over the external calls, the two most crucial of which were that:

1.  in the source execution, external calls did not modify any memory
region the compiler wished to remove;5 and

2.  in the target execution, external calls did not modify target memory
locations that did not correspond to readable locations in the source
memory.

Condition  (2),  in  particular,  enabled  the  proof  of  compiler  phases  such  as 
spilling,  which  introduces  new  unreachable  spill  locations  into  a  target 
program's  stack  frames.  A  deficiency  of  CompCert's  simulation  proofs  and 
of  LSRs  was  that  they  assumed  conditions  (1)  and  (2)  at  external  calls,  but 
did  not  prove  that  these  properties  were  preserved  by  compilation. 

Directly  imposing  constraints  (1)  and  (2)  onto  the  simulation  clauses 
for  internal  steps  does  not  work,  however.  A  compiled  function  should  be 
allowed  to  write  to  its  own  spill  locations — just  not  to  those  of  its  caller. 

To capture the difference in perspective between caller and callee, structured
simulations make three adjustments to the LSR framework.

•  To index the match relation they use structured injections $\mu$ instead of
CompCert's original injections $f$. The additional structure in $\mu$ maintains
the block-level ownership information necessary to tell a callee's
blocks (or other blocks associated with the environment) apart from
blocks associated with the caller.

•  Structured simulations decorate the internal step relation of interaction
semantics with modification effects $E$ such that locations not contained
in $E$ are guaranteed not to be modified (i.e., written to or freed) by
the step in question.

•  Structured simulations impose a restriction axiom on $\sim$ that ensures that
compilation invariants depend only on memory regions either allocated
by the module being compiled, or leaked to it via pointers returned
from external calls.

The details are as follows.

5.3.2  Structured  Simulations 

5 For example, if a source-language variable is represented in memory on the
stack, and in the translation to intermediate language the compiler chooses to use
a register (unaddressable local variable) instead, then I say this memory region is
removed by the compiler.

Recall from Chapter 2 that, in CompCert, memory is allocated in regions, or
blocks. Within each block, memory bytes are addressed using integer offsets
(pointer arithmetic is allowed only within blocks). CompCert's memory
injections

\[
f : \mathsf{block} \to \mathsf{option}\ (\mathsf{block} \times \mathbb{Z})
\]

relate source and target memories. For example, the memory injection that
maps $b$ to $\mathsf{Some}\ (b', z)$ associates source address $(b, \delta)$ with target address
$(b', \delta + z)$.
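The address translation induced by a memory injection can be written out in a few lines. In this Python sketch an injection is modeled as a dictionary, with a missing key playing the role of `None`; the block names and the helper `inject_addr` are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Dict, Optional, Tuple

Block = str
Addr = Tuple[Block, int]                      # (block, offset)
Injection = Dict[Block, Tuple[Block, int]]    # b -> (b', z); absent key = None


def inject_addr(f: Injection, addr: Addr) -> Optional[Addr]:
    """Map source address (b, delta) to target address (b', delta + z)."""
    b, delta = addr
    if b not in f:
        return None          # the block is not mapped (e.g., removed by the compiler)
    b2, z = f[b]
    return (b2, delta + z)


# Source block "b" is placed at offset 16 inside target block "b'".
f = {"b": ("b'", 16)}
```

For instance, `inject_addr(f, ("b", 8))` yields `("b'", 24)`, while an address in an unmapped block yields `None`.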


Structured injections $\mu$ (Figure 5.2) strengthen CompCert's memory injection
relations with additional ownership structure.6 They have four components:
two ownership functions $\mathsf{own}_S, \mathsf{own}_T : \mathsf{block} \to \mathsf{Ownership}$, which
map blocks (in the source and target memories, respectively, of a related
pair of program states) to values of an inductive Ownership type; and two
CompCert-style memory injections, $f_{us}$ and $f_{them}$. $f_{us}$ records the source-target
mapping of blocks that were allocated by the current module; $f_{them}$
maps external blocks (those allocated by other modules).

The Ownership modes are:

    Mode...   applies to...
    ---------------------------------------------------------------------------
    Priv      blocks (memory regions) allocated by the module being
              compiled but which haven't been leaked to the environment
    Pub       allocated blocks that have been leaked at a previous
              interaction point
    Frgn      foreign blocks leaked into $\mu$ at external calls
    Invis     blocks that have been allocated (by another module) but not
              leaked in
    None      blocks that may not yet have been allocated.

A block is (locally) owned by $\mu$ in the source or target memory when
$\mathsf{own}_S(b)$ (resp. $\mathsf{own}_T(b)$) is either Pub or Priv. External blocks in source and
target are those mapped by $\mathsf{own}_{\{S,T\}}$ to Frgn or Invis. Likewise, a block is
shared if its ownership is either Pub or Frgn. The visible source blocks of $\mu$
are those in the set $\mathsf{vis}_S = \mathsf{owned}_S \cup \mathsf{shared}_S$ (and likewise for $\mathsf{vis}_T$). I use
notation $\mathsf{foreign}_{\{S,T\}}$ and $\mathsf{public}_{\{S,T\}}$ to denote the blocks with foreign and
public ownership, respectively.
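The derived block sets follow mechanically from an ownership function. The Python sketch below models one side's ownership function as a dictionary from block numbers to modes and computes the owned, shared, external, and visible sets as defined above; the example assignment `own_S` is invented for the illustration.

```python
from enum import Enum


class Ownership(Enum):
    PRIV = "Priv"
    PUB = "Pub"
    FRGN = "Frgn"
    INVIS = "Invis"
    NONE = "None"


def blocks_with(own, modes):
    """Blocks whose ownership mode is one of `modes`."""
    return {b for b, o in own.items() if o in modes}


def owned(own):
    return blocks_with(own, {Ownership.PUB, Ownership.PRIV})


def shared(own):
    return blocks_with(own, {Ownership.PUB, Ownership.FRGN})


def extern(own):
    return blocks_with(own, {Ownership.FRGN, Ownership.INVIS})


def vis(own):
    return owned(own) | shared(own)


# One block in each ownership mode (a made-up source-side assignment).
own_S = {1: Ownership.PRIV, 2: Ownership.PUB, 3: Ownership.FRGN,
         4: Ownership.INVIS, 5: Ownership.NONE}
```

Here `vis(own_S)` is `{1, 2, 3}`: the private and public blocks of the module together with the foreign block leaked in, but not the invisible block 4. The owned and external sets are disjoint, as the first axiom of Figure 5.2 requires.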

We track ownership of blocks, rather than ownership byte-by-byte, because
the CompCert languages and memory model permit pointer arithmetic
within blocks. Once a location within a block has been made public,
the whole block is made public as well.


6 File compcomp/core/structured_injections.v.




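Structured Injections

\[
\begin{array}{l}
\mathsf{Ownership} = \mathsf{Priv} \mid \mathsf{Pub} \mid \mathsf{Frgn} \mid \mathsf{Invis} \mid \mathsf{None} \\[4pt]
\mu \in \mathsf{StructuredInjection} : \mathsf{Type} = \{ \\
\quad \mathsf{own}_S : \mathsf{block} \to \mathsf{Ownership}; \\
\quad \mathsf{own}_T : \mathsf{block} \to \mathsf{Ownership}; \\
\quad f_{us} : \mathsf{block} \to \mathsf{option}\ (\mathsf{block} \times \mathbb{Z}); \\
\quad f_{them} : \mathsf{block} \to \mathsf{option}\ (\mathsf{block} \times \mathbb{Z})\ \}
\end{array}
\]

\[
\begin{array}{lll}
\mathsf{public}_i & = \{ b \mid \mathsf{own}_i(b) \in \{\mathsf{Pub}\} \}, & i \in \{S, T\} \\
\mathsf{private}_i & = \{ b \mid \mathsf{own}_i(b) \in \{\mathsf{Priv}\} \}, & i \in \{S, T\} \\
\mathsf{foreign}_i & = \{ b \mid \mathsf{own}_i(b) \in \{\mathsf{Frgn}\} \}, & i \in \{S, T\} \\
\mathsf{invis}_i & = \{ b \mid \mathsf{own}_i(b) \in \{\mathsf{Invis}\} \}, & i \in \{S, T\} \\
\mathsf{owned}_i & = \mathsf{public}_i \cup \mathsf{private}_i, & i \in \{S, T\} \\
\mathsf{shared}_i & = \mathsf{public}_i \cup \mathsf{foreign}_i, & i \in \{S, T\} \\
\mathsf{extern}_i & = \mathsf{foreign}_i \cup \mathsf{invis}_i, & i \in \{S, T\} \\
\mathsf{vis}_i & = \mathsf{owned}_i \cup \mathsf{foreign}_i, & i \in \{S, T\}
\end{array}
\]

Structured Injection Axioms

\[
\begin{array}{l}
\mathsf{owned}_i \cap \mathsf{extern}_i = \emptyset,\quad i \in \{S, T\} \\
\forall b_1\ b_2\ z.\ f_{us}\ b_1 = \mathsf{Some}\ (b_2, z) \Longrightarrow b_1 \in \mathsf{owned}_S \wedge b_2 \in \mathsf{owned}_T \\
\forall b_1\ b_2\ z.\ f_{them}\ b_1 = \mathsf{Some}\ (b_2, z) \Longrightarrow b_1 \in \mathsf{extern}_S \wedge b_2 \in \mathsf{extern}_T \\
\forall b_1.\ b_1 \in \mathsf{public}_S \Longrightarrow \exists b_2\ z.\ f_{us}\ b_1 = \mathsf{Some}\ (b_2, z) \wedge b_2 \in \mathsf{public}_T \\
\forall b_1.\ b_1 \in \mathsf{foreign}_S \Longrightarrow \exists b_2\ z.\ f_{them}\ b_1 = \mathsf{Some}\ (b_2, z) \wedge b_2 \in \mathsf{foreign}_T \\
\mathsf{public}_T \subseteq \mathsf{owned}_T \\
\mathsf{foreign}_T \subseteq \mathsf{extern}_T
\end{array}
\]

Injection Restriction

\[
\begin{array}{l}
f|_X = \lambda b.\ \mathsf{if}\ b \in X\ \mathsf{then}\ f\ b\ \mathsf{else}\ \mathsf{None} \\
\mu|_X = \mu\ \mathsf{with}\ \{ f_{us} := f_{us}|_X \}\ \{ f_{them} := f_{them}|_X \}
\end{array}
\]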

Figure 5.2: Structured Injections and the axioms they satisfy. In Coq, the
structured injection axioms are imposed via a dependent record type.

Complementing the data in Figure 5.2 are laws7 that ensure proper
interaction of ownership, leakage, and compilation. These laws, given in
the lower half of Figure 5.2, enforce that $f_{us}$ and $f_{them}$ (1) operate exclusively
on blocks of appropriate ownership (i.e., $f_{us}$ only maps owned blocks, to
owned blocks, and similarly for $f_{them}$ and external blocks); and (2) are total
on their portion of shared blocks: $f_{us}$ must map all $\mathsf{Pub}_S$ blocks, and must
map them to $\mathsf{Pub}_T$ blocks, and similarly for $f_{them}$ and Frgn. The result is that
blocks which have been leaked to/from the environment in one compilation
stage cannot be removed by later stages.

At interaction points between a module and its environment, the structured
injections are adjusted so that (at these points) the shared regions are
closed under pointer arithmetic and dereferencing (there are no pointers
from the shared to the nonshared region). As an additional invariant, structured
simulations maintain that the source visible set $\mathsf{vis}_S$ is always closed
under pointer dereferencing and pointer arithmetic.


Structured Simulation Details. The structured simulation clauses for
initial_core, at_external, and halted are given in Figure 5.3.8


The Initial Core clause states the conditions under which core initialization
tracks from source to target. For any source memory $m$, target
memory $tm$, and CompCert-style memory injection $f$, and for block
sets $\mathsf{dom}_S$ and $\mathsf{dom}_T$, if $c$ is the core initialized by initial_core to handle
function pointer $v$ with arguments $vs$, then there exists a target
core $d$ that results from initializing the target semantics at $v$ with the
related arguments $vt$. The other hypotheses of this clause, such as
those marked by (*), further constrain $f$, $\mathsf{dom}_S$, and $\mathsf{dom}_T$. For example,
$\mathsf{REACH}\ tm\ (\mathsf{globalsOf}\ ge_T \cup \mathsf{blocksOf}\ vt) \subseteq \mathsf{dom}_T$ states that the
set of blocks reachable from target globals $ge_T$ and $vt$ is contained in
$\mathsf{dom}_T$. The hypotheses marked (†) are required to satisfy a technical
invariant of structured simulations, that blocks mentioned by the current
structured injection were allocated at some point in the past (they
may have been freed in the meantime).

The function $\mu_{\mathsf{init}}$ constructs a structured injection from components:


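7 File compcomp/core/structured_injections.v.

8 File compcomp/core/simulations.v.

Structured Simulations $S \preccurlyeq T$

Initial Core

\[
\begin{array}{l}
\forall f\ m\ tm\ v\ vs\ vt\ \mathsf{dom}_S\ \mathsf{dom}_T. \\
\quad \mathsf{initial\_core}\ S\ ge_S\ v\ vs = \mathsf{Some}\ c\ \wedge \\
\quad \mathsf{inject}\ f\ m\ tm\ \wedge \\
\quad \mathsf{vals\_inject}\ f\ vs\ vt\ \wedge \\
(*)\ \mathsf{preserves\_globals}\ ge_S\ f\ \wedge \\
(*)\ (\forall b_1\ b_2\ z.\ f\ b_1 = \mathsf{Some}\ (b_2, z) \Longrightarrow b_1 \in \mathsf{dom}_S \wedge b_2 \in \mathsf{dom}_T)\ \wedge \\
(*)\ \mathsf{REACH}\ tm\ (\mathsf{globalsOf}\ ge_T \cup \mathsf{blocksOf}\ vt) \subseteq \mathsf{dom}_T\ \wedge \\
(\dagger)\ \mathsf{dom}_S \subseteq \mathsf{validBlocks}\ m\ \wedge \\
(\dagger)\ \mathsf{dom}_T \subseteq \mathsf{validBlocks}\ tm \\
\Longrightarrow\ \exists \mu\ d.\ \mathsf{initial\_core}\ T\ ge_T\ v\ vt = \mathsf{Some}\ d \\
\quad \wedge\ (c,m) \sim_{\mu} (d,tm) \\
\quad \wedge\ \mu = \mu_{\mathsf{init}}(\mathsf{dom}_S,\ \mathsf{dom}_T,\ \mathsf{REACH}\ m\ (\mathsf{globalsOf}\ ge_S \cup \mathsf{blocksOf}\ vs), \\
\qquad\qquad\qquad \mathsf{REACH}\ tm\ (\mathsf{globalsOf}\ ge_T \cup \mathsf{blocksOf}\ vt),\ f)
\end{array}
\]

At External

\[
\begin{array}{l}
(c,m) \sim_{\mu} (d,tm) \wedge \mathsf{at\_external}\ c = \mathsf{Some}\ (id_f, vs) \\
\Longrightarrow\ \exists vt.\ \mathsf{inject}\ \mu\ m\ tm \\
\quad \wedge\ \mathsf{vals\_inject}\ \mu|_{\mathsf{vis}_S\,\mu}\ vs\ vt \\
\quad \wedge\ \mathsf{at\_external}\ d = \mathsf{Some}\ (id_f, vt) \\
\quad \wedge\ (c,m) \sim_{\mathsf{leak\_out}(\mu,\, vs,\, vt,\, m,\, tm)} (d,tm) \\
\quad \wedge\ \mathsf{inject}\ \mu|_{\mathsf{shared}_S\,\mu}\ m\ tm
\end{array}
\]

Halted Core

\[
\begin{array}{l}
(c,m) \sim_{\mu} (d,tm) \wedge \mathsf{halted}\ c = \mathsf{Some}\ vs \\
\Longrightarrow\ \exists vt.\ \mathsf{inject}\ \mu\ m\ tm \wedge \mathsf{val\_inject}\ \mu|_{\mathsf{vis}_S\,\mu}\ vs\ vt \\
\quad \wedge\ \mathsf{halted}\ d = \mathsf{Some}\ vt
\end{array}
\]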
Figure 5.3: Structured Simulations: Initial Core, At External, and Halted
Core clauses

    μ_init dom_S dom_T frgn_S frgn_T f : StructuredInjection =
      mkStructuredInjection {
        own_i = λb.
          if b ∈ frgn_i then Frgn
          else if b ∈ dom_i then Invis else None;
        f_us = λb. None;
        f_them = f
      }

The $\mathsf{own}_i$ functions ($i \in \{S, T\}$) are constructed from the sets $\mathsf{dom}_S$, $\mathsf{dom}_T$, $\mathsf{frgn}_S$,
and $\mathsf{frgn}_T$. The "us" injection $f_{us}$ initially maps no blocks, because a
freshly initialized core has not yet allocated any blocks. The $f_{them}$ injection
is set equal to the argument injection $f$. The hypotheses marked
(*) in the figure ensure that the structured injection we build using
$\mu_{\mathsf{init}}$ satisfies the axioms of Figure 5.2.
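To make the construction concrete, here is a Python sketch of the initial structured injection. The record layout (`StructuredInjection` as a dataclass, injections as dictionaries, ownership modes as strings) is an assumption of this illustration and is not the Coq record used in the development.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

Injection = Dict[int, Tuple[int, int]]     # block -> (block, offset delta)


@dataclass
class StructuredInjection:
    own_s: Callable[[int], str]
    own_t: Callable[[int], str]
    f_us: Injection
    f_them: Injection


def mu_init(dom_s: Set[int], dom_t: Set[int],
            frgn_s: Set[int], frgn_t: Set[int],
            f: Injection) -> StructuredInjection:
    """Initial structured injection: nothing is owned yet, f becomes f_them."""
    def make_own(dom, frgn):
        def own(b):
            if b in frgn:
                return "Frgn"      # reachable from globals or arguments
            if b in dom:
                return "Invis"     # allocated by another module, not leaked in
            return "None"          # possibly not yet allocated
        return own

    return StructuredInjection(
        own_s=make_own(dom_s, frgn_s),
        own_t=make_own(dom_t, frgn_t),
        f_us={},                   # a fresh core has allocated no blocks
        f_them=dict(f),
    )


f = {1: (10, 0), 2: (20, 0), 3: (30, 0)}
mu = mu_init({1, 2, 3}, {10, 20, 30}, {1, 2}, {10, 20}, f)
```

In this example source blocks 1 and 2 are foreign, block 3 is invisible, and block 4 (outside the source domain set) has mode `None`. No block is `Priv` or `Pub` initially, which matches the empty `f_us`.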

The At External clause asserts that at_external can be tracked from source
to target: source states calling an external function only match target
states calling the same external function, with related arguments. The
extra match clause in the conclusion of the rule,

\[
(c,m) \sim_{\mathsf{leak\_out}(\mu,\, vs,\, vt,\, m,\, tm)} (d,tm)
\]

enforces that, at external function call points, the match relation $\sim$
is closed under the "leak out" operation defined later in this chapter.
In other words, $\sim$ is not invalidated if we mark as public, at external
call points, all those blocks reachable from the arguments $vs$ and $vt$.
I explain this property in more detail in the next section, when I
introduce the rule for external function call returns.

The final clause, Halted Core, asserts preservation of termination behavior.
It says that halted source states only match target states that are
also halted. In addition, we get that at termination, the source and
target memories $m$ and $tm$ are related by $\mu$, and that the return values
$vs$ and $vt$ are related by the restriction of $\mu$ to visible source blocks,
$\mu|_{\mathsf{vis}_S\,\mu}$.


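Internal and External Steps. Figure 5.4 presents the two core clauses of
structured simulations $\preccurlyeq$, those for internal (i.e., unobservable) steps (Internal
Steps) and for external interactions with the environment (External

Structured Simulations $S \preccurlyeq T$ (cont'd)

Internal Steps

\[
\begin{array}{l}
(c,m) \sim_{\mu} (d,tm) \wedge ge_S \vdash c,m \overset{E_S}{\longmapsto} c',m' \Longrightarrow \\
\exists d'\ tm'\ \mu'. \\
\quad (1)\ \mu \sqsubseteq_{us} \mu'\ \wedge \\
\quad (2)\ \mathsf{separated}\ \mu\ \mu'\ m\ tm\ \wedge \\
\quad (3)\ \mathsf{locally\_allocated}\ \mu\ \mu'\ m\ tm\ m'\ tm'\ \wedge \\
\quad (4)\ (c',m') \sim_{\mu'} (d',tm')\ \wedge \\
\quad (5)\ \exists E_T.\ ge_T \vdash d,tm \overset{E_T}{\longmapsto}{}^{+}\ d',tm'\ \wedge \\
\quad (6)\ E_S \subseteq \mathsf{vis}_S\ \mu \Longrightarrow \\
\qquad (6a)\ E_T \subseteq \mathsf{vis}_T\ \mu\ \wedge \\
\qquad (6b)\ \forall b_t\ z_t.\ (b_t, z_t) \in E_T \wedge \mathsf{owned}_T\ \mu\ b_t = \mathsf{false} \Longrightarrow \\
\qquad\qquad \exists b_s\ z.\ f_{them}(b_s) = \mathsf{Some}\ (b_t, z) \wedge (b_s, z_t - z) \in E_S
\end{array}
\]

External Steps

\[
\begin{array}{l}
\textit{(at-external)} \\
\quad (c,m) \sim_{\mu} (d,tm)\ \wedge \\
\quad \mathsf{vals\_inject}\ \mu\ vs\ vt \wedge \mathsf{inject}\ \mu\ m\ tm\ \wedge \\
\quad \mathsf{at\_external}\ c = \mathsf{Some}\ (id_f, vs)\ \wedge \\
\quad \mathsf{at\_external}\ d = \mathsf{Some}\ (id_f, vt)\ \wedge \\
\quad \nu = \mathsf{leak\_out}\ \mu\ vs\ vt\ m\ tm \\[4pt]
\textit{(environment)} \\
\quad \forall \nu'\ vs\ vt\ m'\ tm'. \\
\quad \nu \sqsubseteq_{them} \nu' \\
\quad \wedge\ \mathsf{separated}\ \nu\ \nu'\ m\ tm \wedge \mathsf{injection\_valid}\ \nu'\ m'\ tm' \\
\quad \wedge\ \mathsf{val\_inject}\ \nu'\ vs\ vt \wedge \mathsf{inject}\ \nu'\ m'\ tm' \\
\quad \wedge\ \mathsf{forward}\ m\ m' \wedge \mathsf{forward}\ tm\ tm' \\
\quad \wedge\ \mathsf{unchanged\_on}\ \{ (b,z) \mid \mathsf{own}_S\ \nu\ b = \mathsf{Priv} \}\ m\ m' \\
\quad \wedge\ \mathsf{unchanged\_on}\ (\mathsf{local\_out\_of\_reach}\ \nu\ m)\ tm\ tm' \\
\quad \wedge\ \mu' = \mathsf{leak\_in}\ \nu'\ vs\ vt\ m'\ tm' \\[4pt]
\textit{(after-external)} \\
\Longrightarrow\ \exists c'\ d'.\ \mathsf{after\_external}\ vs\ c = \mathsf{Some}\ c' \\
\quad \wedge\ \mathsf{after\_external}\ vt\ d = \mathsf{Some}\ d' \\
\quad \wedge\ (c',m') \sim_{\mu'} (d',tm')
\end{array}
\]
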
Figure  5.4:  Structured  Simulations:  Internal  and  External  Step  cases 
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separated  p  p' m  tm  = 

(V&i  &2  -2-  P  h  =  None  =^>  p'  b\  =  Some  (b2,z)  ==^ 
b\  ^  dom gp  A  &2  ^  dom  t  p) 

A  (V6i.  A  i.  domsp  A  b\  G  dom^^'  =^>  — 'valid  m  A) 

A  (V62.  ^  domy^  A  &2  ^  domy;/  =^>  — iva lid  tm  &2) 

locally.allocated  ^  m  tm  m!  tm!  = 

dom^^'  =  doms^  U  fresh  Iocs  ?n  m7 
A  domy^  =  domy^  U  fresh  Iocs  tm  tm7 
A  owned^ p!  =  owned^^  U  freshlocsmm7 
A  owned  x  p'  =  owned  y/r  U  fresh  Iocs  tm  tm7 
A  extern  g  p'  =  extern  g  p 
A  extern  x  p1  =  extern  xp 

local_out_of .reach  p  m  = 

{( b,z )  |  6  £  owned  y  p  A 
V  bo  <5.  fus  p  bo  =  Some  (b,S)  =^> 

max.perm  m  &o  (z  ~  $)  Cperm  Nonempty  V  bo  i  public^  jt/ } 
Figure  5.5:  Structured  simulations:  additional  definitions 


Steps).  The  structure  of  the  internal  diagram  is  familiar  from  traditional  for¬ 
ward  simulation  proofs:  Assume  we  are  in  matching  initial  states  (c,  m) 

( d ,  tm)  and  we  take  a  source  step  geg  h  c,m  A  ,  vn!  with  effect  Eg.  Then 
there  exists  a  matching  d' ,  tm1,  and  Kripke-extended  structured  injection 

zp 

p'  such  that  gex  h  d,  tm  n-G+  d' ,tm'  and  ( c' ,m ')  ( d',tm ').  Clause  (1) 

(Kripke  extension,  p  Cus  p')  says  that  p'  may  map  more  owned  blocks  than 
p  (in  order  to  deal  with  allocations)  but  otherwise  is  equal  to  p.  Clauses  (2) 
and  (3)  are  side  conditions,  the  definitions  of  which  are  given  in  Figure  5.5 
(separated  and  (locally.allocated).  Separated  p  p' m  tm  states,  essentially,  that 
new  regions  mapped  by  p'  but  not  by  p  do  not  correspond  to  regions  that 
already  exist  in  m  or  tin.  Locally.allocated  p  p'  m  tm  m  tm'  states  that  any 
new  blocks  in  p'  (fresh  blocks  allocated  in  this  step)  are  recorded  as  local. 
Clause  (6)  is  the  guarantee  condition: 

•  (6a) asserts that the target effects $E_T$ are contained in
$\mathsf{vis}_T\,\mu$, assuming that $E_S \subseteq \mathsf{vis}_S\,\mu$. In
other words, the compiler preserves the property of "writing and freeing
only to visible regions."

•  (6b) guarantees that writes to (and frees of) memory locations in the
target that are not owned by $\mu$ ($\mathsf{owned}_T\,\mu\ bt = \mathsf{false}$)
can be "tracked back" to corresponding writes and frees in the source
($\exists bs\ z.\ f_{them}(bs) = \mathsf{Some}\ (bt, z)$ and
$(bs, zt - z) \in E_S$). Writes/frees of locations in blocks
owned by the module being compiled are always permitted, which
enables the compiler to introduce reloading code (for spilled variables)
or to add function prologue/epilogue code that saves/restores
callee-save registers.

The $E_S$ and $E_T$ that appear in clause (5) and in step judgments are effect
annotations. For example, $ge_S \vdash c, m \stackrel{E_S}{\longmapsto} c', m'$
means: configuration $c, m$ steps to $c', m'$, writing to or freeing exactly
the locations $E_S$. Locations not contained in this set are guaranteed not to
be modified. I state these "does not modify" guarantees intensionally in this
way, as effect annotations, in order to prove vertical composition. The problem
with a more extensional interpretation of effects (e.g., as input-output
"unchanged on" conditions) is that effects no longer "decompose": If a program
takes two steps, from $m$ to $m''$ with effect set $E_1$ and from $m''$ to $m'$
with effect set $E_2$, with overall extensional effect $E$, it may be the case
that $E_1 \cup E_2 \neq E$ if, for example, the second step restored a value
that was overwritten by the first step. Decomposition is required to prove
transitivity of structured simulations, as described in Chapter 6.
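The failure of decomposition can be seen in a small model. The sketch below is only an illustration under simplifying assumptions: memories are plain dictionaries from location names to values, and the helper `extensional_effect` is a hypothetical name for the input-output ("unchanged on") reading of an effect, not a definition from the Coq development.

```python
def extensional_effect(before, after):
    """Locations whose contents differ between two memories (input-output view)."""
    return {loc for loc in before.keys() | after.keys()
            if before.get(loc) != after.get(loc)}

# Toy memories: m steps to m2 (written m'' in the text), then m2 steps to m1 (m').
m = {"x": 1, "y": 2}
m2 = {"x": 9, "y": 2}   # first step overwrites x
m1 = {"x": 1, "y": 7}   # second step restores x and writes y

e1 = extensional_effect(m, m2)   # effect of the first step
e2 = extensional_effect(m2, m1)  # effect of the second step
e = extensional_effect(m, m1)    # overall input-output effect

# The per-step effects mention x, but the overall extensional effect does not,
# because the second step restored the value the first step overwrote.
decomposes = (e1 | e2) == e
```

Here `decomposes` is false: the union of the two per-step effects contains `x`, while the overall input-output effect contains only `y`. Effect annotations avoid the mismatch by recording every written location, so the annotation of a two-step execution is the union of the per-step annotations.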

The external step diagram occupies the bottom half of Figure 5.4. It
relates an at_external source-target configuration pair
$(c, m) \sim_{\mu} (d, tm)$ with the after_external configuration pair
$(c', m') \sim_{\mu'} (d', tm')$ that results from making an external call.
The basic premise is: For any source-target return values $vs$, $vt$, return
memories $m'$ and $tm'$, and structured injection $\nu'$ satisfying the listed
conditions, it's possible to inject $vs$ and $vt$ into states $c$ and $d$,
resulting in the new states $c'$ and $d'$ which match in $\mu'$, $m'$, and
$tm'$ ($(c', m') \sim_{\mu'} (d', tm')$). The $\nu \sqsubseteq_{them} \nu'$
condition is dual to the $\sqsubseteq_{us}$ condition used in the internal step
diagram. It says that $\nu'$ may map more external blocks than $\nu$ (in order
to deal with allocations performed by the environment) but otherwise is equal
to $\nu$. The other nonbolded conditions are adapted from CompCert, and follow
in our Coq proofs directly from symmetric conditions on the match-state
relation and the internal step diagram.

The conditions listed in bold together compose the structured simulation
rely. The predicate unchanged_on $U$ $m$ $m'$ specifies that memories $m$
and $m'$ are equal (same contents and permissions) at the locations in set $U$.
In the source execution, I use
$\mathsf{unchanged\_on}\ \{(b, z) \mid \mathsf{own}_S\ \nu\ b = \mathsf{Priv}\}\ m\ m'$
to ensure that $m$ and $m'$ are equal at locations in the private blocks of
the injection $\nu$, which is built from $\mu$ by updating leakage information
as described below. The target-execution condition

$\mathsf{unchanged\_on}\ (\mathsf{local\_out\_of\_reach}\ \nu\ m)\ tm\ tm'$

says that $tm$ and $tm'$ are equal at owned target locations that either (1) do
not correspond to readable source locations, or (2) are mapped from private
source locations. By using unchanged_on here, I stipulate the nonmodification
conditions of the rely extensionally.

[Diagram: memory blocks labeled private, public, and foreign, shown for the
leak-in and leak-out operations.]

Figure 5.6: Graphical representation of the structured injection leakage
operations. The thick black arrows are pointers in memory. The white
private and light gray public boxes are owned ("us") blocks. The dark
gray boxes are foreign ("them") blocks. The striped box is an Invis memory
region that was allocated by another module but not yet leaked in. The
leak-in operation marks the reachable invisible region as foreign. The leak-out
operation marks as public a private region reachable from a public pointer.

The structured injection $\nu$ is built from $\mu$ (the injection that
originally related at_external states $(c, m) \sim_{\mu} (d, tm)$) using the
leak_out function depicted graphically in Figure 5.6 and defined in
Figure 5.7.⁹ The idea is: leak_out "leaks" to the public (other modules) blocks
that are reachable by following pointer paths either from the arguments
$\vec{vt}$ to the external call (blocksOf $\vec{vt}$) or from blocks that were
previously shared ($\mathsf{shared}_T\,\mu$). This is a consistency condition:
It says that structured simulations may not assume anything about the contents
of leaked blocks (the unchanged_on conditions that form the rely satisfied by
the environment apply only to private blocks). The functions reach and REACH
(as defined in Chapter 2) calculate the transitive closure of the points-to
relation on CompCert memories. In the definition of leak_out, I use the
auxiliary function export to update the ownership functions of $\mu$ to map
blocks in the reachable set to Pub.

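⁹ File compcomp/core/reach.v.

\[
\begin{array}{l}
\mathsf{export}_i\ \mu\ B : \mathsf{StructuredInjection} = \\
\quad \mu\ \mathsf{with}\ \{\mathsf{own}_i := \lambda b.\ \mathsf{if}\ b \in B\ \mathsf{then}\ \mathsf{Pub}\ \mathsf{else}\ \mathsf{own}_i\ \mu\ b\},\quad i \in \{S, T\} \\
\mathsf{import}_i\ \mu\ B : \mathsf{StructuredInjection} = \\
\quad \mu\ \mathsf{with}\ \{\mathsf{own}_i := \lambda b.\ \mathsf{if}\ b \in B\ \mathsf{then}\ \mathsf{Frgn}\ \mathsf{else}\ \mathsf{own}_i\ \mu\ b\},\quad i \in \{S, T\} \\[6pt]
\mathsf{leak\_out}\ \mu\ \vec{vs}\ \vec{vt}\ m\ tm : \mathsf{StructuredInjection} = \\
\quad \mathsf{let}\ L_S = \mathsf{REACH}\ m\ (\mathsf{blocksOf}\ \vec{vs} \cup \mathsf{shared}_S\,\mu) \cap \mathsf{owned}_S\,\mu \\
\quad \phantom{\mathsf{let}}\ L_T = \mathsf{REACH}\ tm\ (\mathsf{blocksOf}\ \vec{vt} \cup \mathsf{shared}_T\,\mu) \cap \mathsf{owned}_T\,\mu \\
\quad \mathsf{in}\ \mathsf{export}_T\ (\mathsf{export}_S\ \mu\ L_S)\ L_T \\[6pt]
\mathsf{leak\_in}\ \mu\ vs\ vt\ m\ tm : \mathsf{StructuredInjection} = \\
\quad \mathsf{let}\ L_S = \mathsf{REACH}\ m\ (\mathsf{blocksOf}\ [vs] \cup \mathsf{shared}_S\,\mu) \cap \mathsf{extern}_S\,\mu \\
\quad \phantom{\mathsf{let}}\ L_T = \mathsf{REACH}\ tm\ (\mathsf{blocksOf}\ [vt] \cup \mathsf{shared}_T\,\mu) \cap \mathsf{extern}_T\,\mu \\
\quad \mathsf{in}\ \mathsf{import}_T\ (\mathsf{import}_S\ \mu\ L_S)\ L_T
\end{array}
\]

Figure 5.7: Block leakage


The leak_in function used to define $\mu'$ at the end of the external step
diagram plays a role analogous to that of leak_out, except that here, we are
leaking into $\mu'$ new foreign blocks reachable from the return value $vt$ of
the external call. Likewise, the import function is similar to export, except
that it updates the ownership functions of a structured injection to map the
block set $B$ to Frgn, as opposed to Pub.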
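The leakage operations can be sketched on a single side (source or target) of a structured injection. The model below is a simplification under stated assumptions, not the Coq definitions in compcomp/core/reach.v: a memory maps each block to the set of blocks its contents point to (offsets and permissions are ignored), ownership is a dictionary from blocks to one of four tags, "shared" is taken to mean public or foreign, "owned" to mean private or public, "extern" to mean foreign or invisible, and the same memory is reused before and after the call. All Python names are hypothetical.

```python
PRIV, PUB, FRGN, INVIS = "Priv", "Pub", "Frgn", "Invis"

def reach(mem, roots):
    """Blocks reachable from roots by following pointers stored in mem."""
    seen, stack = set(), list(roots)
    while stack:
        b = stack.pop()
        if b not in seen:
            seen.add(b)
            stack.extend(mem.get(b, ()))
    return seen

def blocks_with(own, kinds):
    """Blocks whose ownership tag is one of kinds."""
    return {b for b, k in own.items() if k in kinds}

def leak_out(own, args, mem):
    """Mark as Pub every owned block reachable from the call arguments or shared blocks."""
    roots = set(args) | blocks_with(own, {PUB, FRGN})
    leaked = reach(mem, roots) & blocks_with(own, {PRIV, PUB})
    return {b: (PUB if b in leaked else k) for b, k in own.items()}

def leak_in(own, rets, mem):
    """Mark as Frgn every external block reachable from the return value or shared blocks."""
    roots = set(rets) | blocks_with(own, {PUB, FRGN})
    leaked = reach(mem, roots) & blocks_with(own, {FRGN, INVIS})
    return {b: (FRGN if b in leaked else k) for b, k in own.items()}

# Block a points to b; c is unrelated; e and f were allocated by another module.
mem = {"a": {"b"}, "b": set(), "c": set(), "e": {"f"}, "f": set()}
mu = {"a": PRIV, "b": PRIV, "c": PRIV, "e": INVIS, "f": INVIS}

nu = leak_out(mu, ["a"], mem)      # external call passes a pointer into a
mu_after = leak_in(nu, ["e"], mem) # environment returns a pointer into e
```

In this run the call leaks `a` and `b` (they become public) while `c` stays private, so the rely's unchanged_on condition continues to protect `c` only. On return, `e` and `f` become foreign because they are reachable from the return value.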

Additional Conditions. Structured simulations impose two additional
consistency conditions which I have not yet discussed in detail: (1) the
simulation relation $\sim_{\mu}$ is closed under restriction of $\mu$ to the
visible source blocks of $\mu$;¹⁰ and (2) whenever $S \preceq T$, the global
environment of target module $T$ is consistent with the globals of $S$: Any
symbol mapped by $S$'s global environment is mapped to the same address by
$T$'s globals (module $T$ may declare additional globals).

Restriction, defined in Figure 5.2 as $\mu|_X$ (with $X$ a block set), just
limits the domain of $\mu$ to $X$. If $\sim_{\mu}$ is closed under restriction
to the visible blocks, then it does not distinguish memories that differ only
at Invis (or None) memory regions. All of the compiler invariants of
Compositional CompCert satisfy this property.


¹⁰ Restriction is in compcomp/core/structured_injections.v. The closure
condition on $\sim_{\mu}$ is in compcomp/core/simulations.v.

The more general version of restriction, and the one actually used in
Compositional CompCert, is: $\sim_{\mu}$ is closed under restriction to any
reach-closed superset of $\mathsf{vis}_S\,\mu$, stated as follows:

\[
\forall X \supseteq \mathsf{vis}_S\,\mu.\ \ \mathsf{REACH\_closed}\ m\ X \;\wedge\; (c, m) \sim_{\mu} (d, tm)
\;\Longrightarrow\; (c, m) \sim_{\mu|_X} (d, tm)
\]

A block set $X$ is REACH_closed in memory $m$ when it contains its reach
closure, as calculated in $m$:

Definition 7 (REACH_closed).

\[
\mathsf{REACH\_closed}\ m\ X \;=\; \mathsf{REACH}\ m\ X \subseteq X
\]

Although  the  key  motivation  for  restriction  is  the  proof  of  vertical  composi¬ 
tion  for  structured  simulations  (Theorem  4,  Chapter  6),  the  condition  also 
makes  intuitive  sense:  the  simulation  invariant  used  to  prove  correctness 
of  a  particular  compilation  phase  should  be  independent  of  those  blocks 
that  were  allocated  by  other  modules  but  not  leaked  to  the  module  being 
compiled.  This  is  one  of  the  ways  in  which  we  ensure  that  compiler  phases 
can, e.g., remove Invis blocks (for example, during a compilation pass that
removes  a  dead  function  call  or  memory  allocation). 


Chapter 


6  - 

Separate  Compilation 


The  structured  simulations  of  the  previous  chapter  compose  both: 

vertically  in  the  sense  that  multiple  structured  simulations,  for  the  distinct 
phases  of  a  compiler,  can  be  composed  end-to-end;  and 

horizontally  in  the  sense  that  module-local  structured  simulations  for  the 
individual  translation  units  of  a  program  can  be  composed  to  build 
an  overall  simulation  relation  for  linked  source  and  target  programs, 
as  expressed  in  the  linking  semantics  of  Chapter  4. 

This chapter presents and explains these two results. I do not give full
LaTeX proofs (the mechanized proofs are available in the Coq sources that
accompany this thesis). But I do describe the most important invariants in
detail.


6.1  Vertical  Composition 

Realistic  compilers  are  composed  of  multiple  translation  phases.  These 
phases  are  composed  transitively,  or  vertically,  to  yield  a  full  compiler.  For 
example,  at  the  time  this  thesis  was  written,  the  most  recent  release  of  the 
CompCert  compiler  (version  2.4)  included  18  verified  compilation  phases, 
each of which was proved correct independently of all the others. CompCert
2.1,  upon  which  Compositional  CompCert  is  based,  included  16  verified 
phases,  also  proved  correct  independently. 

There are a number of reasons why it makes sense to structure a compiler,
whether verified or not, as the transitive composition of a number of
small phases. Decomposing the compiler in this way means each translation
phase does less, simplifying correctness invariants. The various phases
of the compiler can also be used independently. For example, the author of
a compiler for another source language besides C, such as Java or Haskell,
could target just the backend of CompCert.

This section presents the proof that structured simulations, as presented
in Chapter 5, compose transitively. The result is not unexpected: standard
forward simulations are trivially transitive, for example. The proof for
structured simulations is complicated primarily by the external-call clause
(lower half of Figure 5.4), which requires the construction of an interpolating
after-external memory $m_2'$ during the transitivity proof, in the intermediate
execution between source and target. As mentioned in Chapter 5, the proof
of transitivity of the internal-step diagram is tightly dependent on our
treatment of effect annotations.

Theorem 4 (Transitivity). Let $L_1$, $L_2$, and $L_3$ be effect-annotated
interaction semantics. If $L_1 \preceq L_2$ is a structured simulation from
$L_1$ to $L_2$ and $L_2 \preceq L_3$ a structured simulation from $L_2$ to
$L_3$, then there exists a structured simulation $L_1 \preceq L_3$ from $L_1$
to $L_3$.


Proof. In the Coq code that accompanies this thesis (lemma eff_sim_trans
in file compcomp/core/simulations_trans.v). □

The most interesting case of the proof is that for the after_external clause.
In order to establish the $(c_1', m_1') \sim_{\mu'} (c_3', m_3')$ relation
between the return states in languages $L_1$ and $L_3$, one would like to
appeal to the corresponding relations that are inductively given for
$L_1 \preceq L_2$ and $L_2 \preceq L_3$. However, in order for these induction
hypotheses to apply, we must provide a suitable intermediate state
$(c_2', m_2')$, and in particular the memory $m_2'$. Figure 6.1
depicts this situation graphically.

6.1.1  Interpolation 

As illustrated in the figure, we require the existence of a post-call memory
$m_2'$ in $L_2$ such that $m_1'$ can be injected to $m_2'$ (via an extension
$\mu_1'$ of $\mu_1$) and $m_2'$ can be injected to $m_3'$ via $\mu_2'$, such
that $\mu' = \mu_2' \circ \mu_1'$ ($\circ$ is injection composition). This is
assuming $\mu_1$ injects $m_1$ to $m_2$, $\mu_2$ injects $m_2$ to $m_3$, and
$\mu'$ injects $m_1'$ to $m_3'$.

Prior to CompCert 2.0, memory injections did not compose, i.e.
inject $(\mu_2 \circ \mu_1)$ $m_1$ $m_3$ did not follow from
inject $\mu_1$ $m_1$ $m_2$ and inject $\mu_2$ $m_2$ $m_3$. Because
the simulations did not expose memory, transitive compiler correctness
did not require this property to hold. In CompCert 2.0, Leroy respecified
injections to facilitate composition, based on a suggestion of Tahina
Ramananandro. The interpolation lemma provides the counterpart to this
composition, by guaranteeing that the post-call injection
inject $\mu'$ $m_1'$ $m_3'$ can be split into some $m_2'$, $\mu_1'$, and
$\mu_2'$ with inject $\mu_1'$ $m_1'$ $m_2'$ and inject $\mu_2'$ $m_2'$ $m_3'$.
Moreover, these items can be constructed in such a way that the evolution
$m_2 \leadsto m_2'$ inherits the appropriate forward and unchanged_on
properties from the extremal evolutions $m_1 \leadsto m_1'$ and
$m_3 \leadsto m_3'$.

[Diagram: memories $m_1$, $m_2$, $m_3$ related by injections $\mu_1$ and
$\mu_2$; each memory evolves across the external call under forward and
unchanged_on conditions.]

Figure 6.1: Interpolation lemma for composing injection phases
$L_1 \preceq L_2$ and $L_2 \preceq L_3$. Solid lines represent assumptions;
dashed lines represent constraints that the constructed $m_2'$ has to satisfy.
Composition of simulation proofs in this diagram is left-to-right (contrary to
my use of the term vertical for phase-by-phase composition of simulations).

Our  proofs  of  the  interpolation  lemmas  suggested  a  handful  of  addi¬ 
tional  alterations  to  the  memory  model,  which  we  communicated  to  Leroy. 
These  included  a  subtle  refinement  to  the  treatment  of  permissions  across 
external  calls  and  a  tweak  to  the  definition  of  unchOn.  Leroy  installed  these 
modifications  in  CompCert  2.0,  and  we  formally  validated  the  interpolation 
lemma in Coq. That is, we have proved that intermediate memories $m_2'$ and
injections $\mu_1'$ and $\mu_2'$ with the required properties can indeed be
constructed.
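Composition of injections, the operation that interpolation inverts, can be sketched as follows. This is a minimal model under stated assumptions: an injection is a Python dictionary from a block to a (target block, offset) pair, unmapped blocks are simply absent, and the function name `compose` is hypothetical. The sketch shows only how two injections compose; it does not construct the interpolating memory.

```python
def compose(mu1, mu2):
    """Injection composition: b1 maps to (b3, d1 + d2) when mu1 maps b1 to
    (b2, d1) and mu2 maps b2 to (b3, d2); otherwise b1 is unmapped."""
    out = {}
    for b1, (b2, d1) in mu1.items():
        if b2 in mu2:
            b3, d2 = mu2[b2]
            out[b1] = (b3, d1 + d2)
    return out

# Phase 1 places locals x and y inside a stack block sp; z goes to a block tmp.
mu1 = {"x": ("sp", 8), "y": ("sp", 16), "z": ("tmp", 0)}
# Phase 2 places sp inside a frame block and does not map tmp at all.
mu2 = {"sp": ("frame", 4)}

mu = compose(mu1, mu2)
```

The offsets add along the path (`x` lands at offset 12 of `frame`), and `z` drops out of the composite because the second phase leaves `tmp` unmapped. The interpolation lemma goes in the opposite direction: given a post-call injection for the composite, it produces an intermediate memory and two injections whose composition is that injection.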


6.2  Horizontal  Composition 

The second kind of compositionality is horizontal: We would like to know
that composing the simulation relations established by independently
compiling the modules in a program results in an overall simulation between
the (linked) multimodule source and target programs. We give the theorem
statement first, then explain some of the subtleties, in particular, the
restriction to reach-closed source semantics, which enforces the single-program
conditions corresponding to the structured simulation guarantees of Chapter 5,
and to valid target semantics (a technical property related to the CompCert
memory model, explained below).

Theorem  5  (Linking). 

•  If $P_S = S_0, S_1, \ldots, S_{N-1}$ is a multimodule program with $N$
translation units, each of which is reach-closed,

•  $P_S$ is compiled to $P_T = T_0, T_1, \ldots, T_{N-1}$ (possibly by $N$
different compilation functions) such that
$\llbracket S_i \rrbracket \preceq \llbracket T_i \rrbracket$ for each
source-target pair,

•  each $T_i$ is valid, and

•  the global environments of the $S_i$ (resp. $T_i$) have equal domain, then

•  there is a simulation relation
$\mathcal{L}(\llbracket P_S \rrbracket) \le \mathcal{L}(\llbracket P_T \rrbracket)$
between the source and target programs that result from linking the $S_i$ and
independently linking the $T_i$.


Proof. In Coq.¹ The simulation invariant is described in Section 6.2.2. □

The $\le$ in the theorem denotes forward simulation on whole programs,
as in Chapter 5. As Corollary 7 will show, establishing $\le$ is sufficient to
prove  contextual  equivalence  of  open  multimodule  programs  (by  linking 
with  a  closing  context).  Restricting  to  modules  with  equal  global  domain 
may  seem  counterintuitive;  linking  can,  at  least  in  principle,  enlarge  the 
set  of  global  addresses  that  are  visible  to  any  one  module  in  isolation.  The 
"globals  have  equal  domain"  assumption  defers  this  reasoning  to  the  pro¬ 
gram  logic  (Chapter  7),  in  which  it  is  necessary  either  to  prove  safety  mono¬ 
tonicity  under  global  environment  extension,  or  to  preprocess  modules  to 
propagate  global  declarations  (the  current  VST  strategy). 

A  valid  semantics,  as  in  Section  3.3,  is  one  that  never  stores  invalid  point¬ 
ers  into  memory.  Invalid  pointers,  in  CompCert  parlance,  are  those  that 
refer  to  memory  regions  that  have  not  yet  been  allocated  (freed  pointers 
are  never  invalid).  This  condition  is  true  for  all  contexts  we  care  about  (for 
example,  it  holds  of  all  programs  in  CompCert's  Clight  [Lemma  9]  and  x86 
languages  [Theorem  2],  which  do  not  permit  storing  invalid  addresses  into 
memory). 

But why is it necessary? The answer is technical. In order to establish,
during the proof of Theorem 5, the

$\mathsf{REACH}\ tm\ (\mathsf{globalsOf}\ ge_T \cup \mathsf{blocksOf}\ \vec{vt}) \subseteq \mathsf{dom}_T\,\mu$

condition of the Initial Core clause of structured simulations (Figure 5.3),
it is necessary to know that the target memory $tm$ satisfies mem_valid at
each intermodule interaction point. Otherwise, we could not show that the
reach-closure of the set
$\mathsf{globalsOf}\ ge_T \cup \mathsf{blocksOf}\ \vec{vt}$ is contained in
$\mathsf{dom}_T\,\mu$ (we instantiate $\mathsf{dom}_T\,\mu$, in the proof, to
equal the set of valid blocks in $tm$). One could imagine a different proof
strategy, in which $\mathsf{dom}_T\,\mu$ is instantiated directly to
$\mathsf{REACH}\ tm\ (\mathsf{globalsOf}\ ge_T \cup \mathsf{blocksOf}\ \vec{vt})$,
for example. But then we fall afoul of a validity condition required elsewhere
in structured simulations, that
$\mathsf{dom}_T\,\mu \subseteq \mathsf{validBlocks}\ tm$. Ultimately, these
properties are tied deeply to the specifics of CompCert's memory model, such as
the CompCert allocation model (Chapter 2), and to the specifics of structured
simulations.

¹ The theorem statement is in compcomp/linking/linking_spec.v. Theorem link
in file compcomp/linking/linking_proof.v gives the proof.

The  restriction  to  reach-closed  semantics  (Section  3.3)  is  best  motivated 
with  an  example.  Consider  the  following  C  program,  a  variation  of  the 
second  of  the  C  example  programs  from  Chapter  1: 

    // Module A
    void g(int *);

    int f(void) {
        int a; int b = 3;
        g(&a);
        return b;
    }

Function A.f calls an external function B.g, passing &a as argument.

Now imagine we link with the following context:

    // Module B
    #include <stdint.h>  /* for uintptr_t */

    void g(int *p) {
        *(int *)((uintptr_t)p + 4) = 4;
    }

in which B.g writes the value 4 to address &b = p+1 by first casting p to
an integer, adding 4 (the size in bytes of integers on a 32-bit machine), then
casting back to an integer pointer and performing the write. In the context
of this (implementation-defined) g, standard compiler optimizations such
as constant propagation of b in A.f are unsound, as the discussion in
Chapter 1 showed.

The point of a compositional compiler, however, is to enable local modular
compilation, which should depend only on translation-unit-local analyses.
Correctness of optimizations like constant propagation, dead-code
elimination, and inlining should not depend on the particulars of the larger
program context in which a module is executed (e.g., the implementation of
B.g), only that the larger context respects the C-level abstractions assumed
by the compiler.

The challenge, then, is coming up with a characterization of the source
modules $S_0, S_1, \ldots, S_{N-1}$ that does admit linking as in Theorem 5.
We do this in general, for arbitrary interaction semantics, by observing that
the write to &b = (int*)((uintptr_t)p + 4) is ill-formed not because it
goes wrong (the write is safe under certain interpretations of the behavior of
integer-pointer casts), but because it's a write to a location that the context
B.g shouldn't have "known about" in the first place.

Put another way, address &b was not reachable via pointer arithmetic²
either from g's initial arguments (pointer arithmetic across local variable
regions is undefined), from global variables, or from the return values of
external calls g may have made. This condition (no writes or frees to
locations that are not "visible") is the analogue of the
$E_S \subseteq \mathsf{vis}_S\,\mu$ in clause (6) of Figure 5.4, but stated as
a single-program property, independent of any particular structured injection
$\mu$. We formalize the notion of a semantics that respects this
characterization of visible locations as the reach-closed semantics of
Chapter 3, Section 3.3.

From  the  perspective  of  compiler  correctness  proofs,  the  restriction  to 
reach-closed  contexts  is  what  enables  program  transformations:  It  would  be 
unsound,  for  example,  to  constant-propagate  b  out  of  memory  if  the  larger 
program  context  depended  on  it,  as  in  the  example  program  above. 
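The example can be replayed in a toy block-based model. The sketch below rests on simplifying assumptions and is not the reach-closed semantics of Section 3.3: each local variable of A.f is its own block, a memory maps a block to the set of blocks its contents point to, an effect is a set of (block, offset) pairs, and the names `reach` and `writes_only_visible` are hypothetical.

```python
def reach(mem, roots):
    """Blocks reachable from roots by following pointers stored in mem."""
    seen, stack = set(), list(roots)
    while stack:
        b = stack.pop()
        if b not in seen:
            seen.add(b)
            stack.extend(mem.get(b, ()))
    return seen

def writes_only_visible(effects, visible_blocks):
    """True when every written or freed location lies in a visible block."""
    return all(block in visible_blocks for (block, _offset) in effects)

# A.f's locals a and b live in separate blocks; neither stores a pointer.
mem = {"a": set(), "b": set()}

# B.g receives only &a, so only block a is visible to it.
visible_to_g = reach(mem, {"a"})

write_through_p = {("a", 0)}  # a write of the form *p = 4
write_to_b = {("b", 0)}       # the write performed by Module B, landing in b

well_behaved = writes_only_visible(write_through_p, visible_to_g)
ill_behaved = writes_only_visible(write_to_b, visible_to_g)
```

A write through `p` stays inside the visible set, while the write to `b` does not, because no pointer path leads from `a` to `b`. A context that performs only the first kind of write cannot observe whether `b` was constant-propagated out of memory.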

6.2.1  Reach-Closed  Contextual  Equivalence 

As a corollary of Theorem 5, we get a form of contextual equivalence when
the source modules are reach-closed and the target modules are valid, stated
in terms of a variation of Definition 4 in which contexts satisfy a few
additional properties. Informally, if each module in multimodule program
$P_S$ is compiled to the corresponding module in target $P_T$, then $P_S$ and
$P_T$ have the same behavior (termination, divergence) when linked with a
well-defined program context $C$. $\mathcal{L}(C, P_S)$ may also go wrong, in
which case we say nothing about the behavior of $\mathcal{L}(C, P_T)$.


² When the program context is implemented in a language like x86 assembly,
it might seem strange to say "not reachable via pointer arithmetic" since in
most assembly models the entire address space is "reachable". Here we mean "not
reachable" in the instrumented semantics of x86 assembly used by CompCert,
in which memory is allocated in blocks, as in CompCert's Clight, and interblock
pointer arithmetic is disallowed.

More formally, well-defined contexts are those deterministic $C$ that
self-simulate, and which are both reach-closed and valid.

Definition 8 (Well-Defined Contexts).

\[
\mathsf{well\_defined}\ C \;=\; \mathsf{deterministic}\ C \;\wedge\; C \preceq C \;\wedge\; \mathsf{reach\_closed}\ C \;\wedge\; \mathsf{valid}\ C
\]

The $C \preceq C$ condition says that $C$ commutes with memory injections:
If $C$ is initialized twice with injected arguments, both executions either
go wrong, nonterminate, or equiterminate with injected results. Although
this condition follows directly from the form of Theorem 5, it is strongly
motivated: We should not allow contexts to distinguish source and target
programs based solely on bijective renamings of memory blocks exposed
to the context (pointer arithmetic is not allowed between blocks, only within
blocks). The consistency conditions on structured injections and simulations
that we described in Chapter 5 mean that in the proof of $C \preceq C$, the
context may assume that all public blocks leaked by the program are mapped
from source to target (they are never removed during compilation of the
program).

Reach-closed contextual equivalence is then just equitermination, assuming
the source linked program is safe, in all well-defined contexts:

Definition 9 (Reach-Closed Contextual Equivalence).

\[
\begin{array}{l}
P_S \approx_{rc} P_T \;=\; \forall C\ j\ m\ tm\ ge\ v\ \vec{vs}\ \vec{vt}. \\
\quad \mathsf{well\_defined}\ C \;\wedge\; \mathsf{init\_inv}\ j\ ge\ \vec{vs}\ m\ ge\ \vec{vt}\ tm \;\wedge\; \mathsf{safe}\ ge\ C\ P_S\ v\ \vec{vs}\ m \\
\quad \Longrightarrow\ (\mathsf{terminates}\ ge\ C\ P_S\ v\ \vec{vs}\ m \iff \mathsf{terminates}\ ge\ C\ P_T\ v\ \vec{vt}\ tm)
\end{array}
\]

Invariant init_inv is defined as in Section 5.1. Essentially: injection $j$
relates $m$ to $tm$ and $\vec{vs}$ to $\vec{vt}$, and is the identity on
$\mathsf{dom}(ge)$. Also, $\vec{vt}$ and $tm$ must not contain invalid pointers
(to memory regions that have not yet been allocated).

The global environment $ge$ : Genv unit unit is used solely to ensure that
the global environments of the linked modules have equal domain; hence
we use the same $ge$ in both source and target.³

Predicates safe and terminates are overloaded to operate on whole programs,
instead of configurations, as follows. Say a program $P$ is initializable
at entry point $v$ with arguments $\vec{v}$ if initialization succeeds for
$\vec{v}$ at $v$.

Definition 10 (initializable $ge$ $C$ $P$ $v$ $\vec{v}$).
$\exists c.\ \mathsf{initial\_core}_{\mathcal{L}(C, \llbracket P \rrbracket)}\ ge\ v\ \vec{v} = \mathsf{Some}\ c$

³ Recall that the global environments used to look up function bodies in
$\mathcal{L}$ are language-specific, and therefore the per-module ones.

A program $P$ has behavior $b$ in context $C$ and memory $m$, at entry
point $v$ with arguments $\vec{v}$, if either (i) $P$ is initializable, in
linked semantics $\mathcal{L}(C, \llbracket P \rrbracket)$, to a configuration
$(c, m)$ that has behavior $b$, or (ii) $b$ = Going wrong and $P$ is not
initializable ($P$ initially went wrong in context $C$).
Program $P$ terminates if it has behavior Termination.

If each of the pairs $S_i$, $T_i$ in a multimodule program is related by
structured simulations $\llbracket S_i \rrbracket \preceq \llbracket T_i \rrbracket$,
then the linked source and target programs are reach-closed contextually
equivalent.

Corollary 7 (Simulation Implies Contextual Equivalence). Let

•  $P_S = S_0, S_1, \ldots, S_{N-1}$, and

•  $P_T = T_0, T_1, \ldots, T_{N-1}$

for reach-closed source modules $S_0, S_1, \ldots, S_{N-1}$ with equal global
domains, and valid deterministic target modules $T_0, T_1, \ldots, T_{N-1}$.
If for each $i$, $\llbracket S_i \rrbracket \preceq \llbracket T_i \rrbracket$,
then $P_S \approx_{rc} P_T$.

Proof. The Coq proof is in file compcomp/linking/context_equiv.v. □

In the above, we assume closing contexts $C$ (those that do not themselves
call external functions not defined by any of the modules; callbacks into
$P_S$ and $P_T$ are permitted). $C$ must also be well-defined (cf.
Definition 8). Safety of the source linked program and determinism of the
target modules are required to prove the backward direction of the equivalence
(the forward direction holds without these assumptions).

We also have a form of contextual refinement.⁴

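Definition 11 (Reach-Closed Contextual Refinement).

$P_T \sqsubseteq_{rc} P_S \triangleq \forall C\ j\ m\ tm\ ge\ v\ \vec{v}_S\ \vec{v}_T.$
$\quad \text{well-defined}\ C \wedge \mathsf{init\_inv}\ j\ ge\ \vec{v}_S\ m\ ge\ \vec{v}_T\ tm$
$\quad \Longrightarrow (C, P_T, ge, v, \vec{v}_T, tm) \le_{beh} (C, P_S, ge, v, \vec{v}_S, m)$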

In the definition, the $\le_{beh}$ relation of Section 5.2.3 is overloaded to operate
on programs, in addition to behaviors. The relation

$(C, P_T, ge, v, \vec{v}_T, tm) \le_{beh} (C, P_S, ge, v, \vec{v}_S, m)$

means: for all behaviors $b_T$ of program $P_T$ in context C, initialized at v
with arguments $\vec{v}_T$ in memory tm, there exists a behavior $b_S$ of $P_S$ in
context C (initialized at v, . . . ) such that $b_T$ is a refinement of $b_S$ ($b_T \le_{beh} b_S$).
Refinement of behaviors is defined as in Section 5.2.3.

4 Strictly speaking, not derivable from the equivalence shown above (Definition 9;
we will refine divergence as well as termination behavior).

Up to classical reasoning, every program has at least one behavior:

Theorem 6 (Behavior Exists). For all programs P, global environments ge, entry
points v, initial arguments $\vec{v}$, initial memories m, and contexts C, there exists a
behavior b such that $(C, P, ge, v, \vec{v}, m)$ has behavior b.

Proof.  In  Coq.5  □ 

We  then  get  that  simulation,  as  in  Corollary  7,  implies  reach-closed 
contextual  refinement. 

Corollary 8 (Simulation Implies Contextual Refinement). Let

• $P_S = S_0, S_1, \ldots, S_{N-1}$, and

• $P_T = T_0, T_1, \ldots, T_{N-1}$

for reach-closed source modules $S_0, S_1, \ldots, S_{N-1}$ with equal global domains, and
valid deterministic target modules $T_0, T_1, \ldots, T_{N-1}$. If for each i, $[\![S_i]\!] \preceq [\![T_i]\!]$,
then $P_T \sqsubseteq_{rc} P_S$.

Proof. The Coq proof is in file compcomp/linking/context_equiv.v. By
Theorem 5, we have $\mathcal{L}(C, [\![P_S]\!]) \le \mathcal{L}(C, [\![P_T]\!])$. By assumption and Theorem 3,
we have deterministic $\mathcal{L}(C, [\![P_T]\!])$. The theorem follows by Corollary 5
of Chapter 5 (behavior refinement from whole-program simulation). □

In the definitions $\sim_{rc}$ and $\sqsubseteq_{rc}$ above, it's important that the definition
of well-defined contexts is not too narrow. Otherwise, we risk ruling out
reasonable programs. At the very least, every C-program context should
be well-defined in the sense of Definition 8. Otherwise, the equivalence $\sim_{rc}$
would be quite weak (it would not be robust to linked C-language contexts).
As justification, I have proved the following.

Theorem 7 (Clight Programs are Well-Defined Contexts). Take Clight program P.
The interpretation $[\![P]\!]$ of P as module semantics is well-defined according
to Definition 8.

This theorem lower-bounds the quantification over contexts in $\sim_{rc}$: C is at
least instantiable by any relation that corresponds to a well-defined Clight
program.

Since C may express arbitrary relations in Coq's Gallina, up to the
conditions of Definition 8, there are contexts C that correspond to no
well-defined Clight program, yet are still well-defined according to Definition 8.


5 File compcomp/linking/context_equiv.v.
For example, take C equal to the (undecidable) relation that reads as input
the description in memory of a Turing machine (and its input) and returns
Vint 1 if the Turing machine terminates (on that input), and Vint 0 otherwise.
This C is deterministic (no deterministic Turing machine both terminates
and loops forever), self-simulates (the Turing-machine description is an
array of integers in the heap, pointed to by one of C's arguments), and is
both reach-closed and valid (C could be implemented to write only integers
to a sequence of memory regions it allocates itself).

The proof 6 of Theorem 7 relies on a number of auxiliary lemmas,
corresponding one-for-one with the conditions of Definition 8.

Lemma  7.  Clight  programs  are  deterministic. 

Proof. File compcomp/cfrontend/Clight_lemmas.v. Due to a misspecification
in the original CompCert of certain compiler intrinsics (architecture-dependent
instructions for certain 64-bit operations), the proof of this lemma
currently assumes the i64_helpers case. □

Lemma 8. For all Clight programs P, $[\![P]\!] \preceq [\![P]\!]$.

Proof. File compcomp/cfrontend/Clight_self_simulates.v. □

Lemma  9  (Clight  Programs  are  Valid).  Every  Clight  program  is  valid,  in  the 
sense  of  Definition  3. 

Proof. File compcomp/linking/clight_nucular.v. □

Theorem  1  proved  that  all  Clight  programs  are  reach-closed. 

6.2.2  Linking  Invariants 

The main difficulty in proving Theorem 5 and, by extension, Corollaries 7
and 8, is in devising a simulation invariant7 to relate the stacks-of-cores
runtime states of the linked programs $P_S$ and $P_T$.

The situation is presented schematically in Figure 6.2. In the source
linked program, we have a stack of core states, growing downwards, with
c in callee position with respect to a (direct or indirect) caller core $c_0$, which
may be implemented in a different language. We must relate this stack of
cores to the corresponding stack in the target linked program. We use $\mu$
to denote the structured injection that relates the callees c and d, and

6 File compcomp/linking/context.v.

7 File compcomp/linking/linking_inv.v.
[Figure 6.2 diagram: the source stack and the target stack of cores, side by side; both stacks grow downward.]

Figure 6.2: Schematic representation of the stacks-of-cores linking invariant.
The white boxes are core states. Source core c and target core d are callees
at the bottom of the LinkedState callstack, related by structured injection $\mu$
(memory is elided). Cores $c_0$ and $d_0$ are caller cores related by $\nu$.


$\nu$ to denote the injection that relates callers $c_0$ and $d_0$. In the figure, we
elide the memories (for callers, the memory at the call point is existentially
quantified). A caller core may be a callee with respect to another caller
higher on the callstack.

The key rely-guarantee condition is to ensure that blocks labeled as
foreign, or leaked-in, by callee injections $\mu$ are always labeled as public by
caller injections $\nu$:

$\mathsf{foreign}_S\ \mu \cap \mathsf{owned}_S\ \nu \subseteq \mathsf{public}_S\ \nu$ (6.1)

From the fact that source modules are reach-closed

$E_S \subseteq \mathsf{REACH}\ m\ (\mathsf{roots}\ ge\ r)$ (6.2)

we then can show that the memory effects of the running callee core at
the top of the callstack are confined to callee-allocated (owned) and foreign
blocks. This implies that private caller memory regions in $\nu$, which are
disjoint from the blocks marked as public by $\nu$, remain unmodified.

A difficulty here is how to relate the root sets of source modules to the
visible sets $\mathsf{vis}_S$ used in the simulation relations. We do this by maintaining
the following two invariants:

$\mathsf{roots}\ ge\ r \subseteq \mathsf{vis}_S\ \mu$ (6.3)

$\mathsf{REACH}\ m\ (\mathsf{vis}_S\ \mu) \subseteq \mathsf{vis}_S\ \mu$ (6.4)

Invariant (6.3) says that the root set of the source semantics is a subset of
the visible source blocks in $\mu$. This invariant holds initially, when r is first
created, and is maintained at external function calls and returns. Condition (6.4),
which we maintain as an invariant of all structured simulations,
says that the visible set is closed under reachability. These two conditions,
plus (6.2) and monotonicity of the REACH relation, imply that $E_S$ is a subset
of $\mathsf{vis}_S\ \mu$. This fact, together with condition (6.1) above, is sufficient to prove
the unchanged_on relies of Figure 5.4 at the point at which the running core
returns to its calling context.
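The chain of inclusions behind the claim that the effects stay inside the visible set can be spelled out as follows. This is only a restatement of the reasoning in the paragraph above, in reconstructed notation ($E_S$ for the source effects, $\mu$ for the callee injection); the mechanized proof in linking_inv.v may be organized differently.

```latex
\begin{align*}
E_S &\subseteq \mathsf{REACH}\ m\ (\mathsf{roots}\ ge\ r)
      && \text{by (6.2)} \\
    &\subseteq \mathsf{REACH}\ m\ (\mathsf{vis}_S\ \mu)
      && \text{by (6.3) and monotonicity of } \mathsf{REACH} \\
    &\subseteq \mathsf{vis}_S\ \mu
      && \text{by (6.4)}
\end{align*}
```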

Linking Invariants: Details. More formally, we define the toplevel simulation
invariant (called match_state below) that relates linked states $x_1$
(in the source) and $x_2$ (in the target) as follows:

 1  match_state $\mu$ ($x_1$ : LinkedState N $\mathit{modules}_S$) m
 2               ($x_2$ : LinkedState N $\mathit{modules}_T$) tm =
 3    let $s_1$ = $x_1$.stack in
 4    let $s_2$ = $x_2$.stack in
 5    let $pf_1$ = callStack_nonempty $s_1$ in
 6    let $pf_2$ = callStack_nonempty $s_2$ in
 7    let c = Stack.head $pf_1$ $s_1$ in
 8    let d = Stack.head $pf_2$ $s_2$ in
 9    ($\exists$ (pf : c.idx = d.idx) $\vec{p}$. head_inv c d pf $\mu$ $\vec{p}$ m tm
10       $\wedge$ tail_inv $\vec{p}$ (pop $s_1$) (pop $s_2$) m tm)
11    $\wedge$ $x_1$.plt = $x_2$.plt
12    $\wedge$ $\forall i : I_N$. valid_genv ($\mathit{modules}_T$ i).ge tm

Recall here that the core states of linking semantics are defined (cf. Chapter 4)
as records of: a procedure linkage table (plt) and a (heterogeneous)
stack of core states, corresponding to dynamic invocations of the modules
in the program.

Record LinkedState (N : pos) (modules : $I_N \to$ Modsem) =
  { plt : ident $\to$ option $I_N$;
    stack : Stack (Core N modules) }

The modules map pairs integers in the range 0 to N − 1 with the Modsem
module semantics associated with each module in the program. The invariant
is also parameterized by a number of structured simulation relations,
one for each compiled module in the program. We write $\sim_{idx}$ to denote
the simulation relation for module number idx.

The first few lines of the match_state invariant just introduce new names
for the various parts of the linked states $x_1$ and $x_2$. The let-bound variables
c and d are defined, via Stack.head, as the cores on top of the source and
target stacks respectively, as in Figure 6.2. The names $s_1$ and $s_2$ are aliases
for the stack component of linked states $x_1$ and $x_2$. $pf_1$ and $pf_2$ are proofs
that stacks $s_1$ and $s_2$ are nonempty (an invariant of linking semantics, as
defined in Chapter 4). These proofs are passed as arguments to Stack.head,
to ensure that Stack.head is total. Line 11 states that source and target linked
states contain equal PLTs. Line 12 asserts that each target module global
environment is valid with respect to tm (the environment does not map
globals to addresses that were never allocated in tm).
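The idea of making a head function total by passing it a non-emptiness proof can be sketched in a few lines of Coq. This is a minimal illustration under simplifying assumptions: the stack is a plain list, non-emptiness is the proposition `s <> nil`, and the names `head` and `nonempty_cons` are hypothetical rather than the definitions used in the compcomp development.

```coq
Require Import List.

(* Hypothetical simplification: a stack is a list, and the proof argument
   rules out the [nil] case, so no default value or option is needed. *)
Definition head {T : Type} (s : list T) : s <> nil -> T :=
  match s as s0 return (s0 <> nil -> T) with
  | nil => fun pf => False_rect T (pf eq_refl)
  | x :: _ => fun _ => x
  end.

Lemma nonempty_cons {T : Type} (x : T) (l : list T) : x :: l <> nil.
Proof. discriminate. Qed.

(* The head of a two-element stack is its first element. *)
Example head_example :
  head (1 :: 2 :: nil) (nonempty_cons 1 (2 :: nil)) = 1.
Proof. reflexivity. Qed.
```

The `nil` branch is unreachable at runtime: it can only be entered with a proof of `nil <> nil`, which is contradictory.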

The key section of the invariant runs from line 9 to line 10. Line 9 is the
invariant head_inv on the topmost cores on the stack, c and d. Line 10 is
the invariant tail_inv on the remaining (suspended) cores, pop $s_1$ and pop $s_2$.
Existentially quantified are: a proof pf that the module index of core c
equals the index of core d, and a list of frame packages $\vec{p}$ that relate each pair
of source-target suspended (caller) cores on the source and target stacks.
Frame packages are records


p $\in$ frame_pkg = mk_frame_pkg
  { frame_$\mu$ : StructuredInjection;
    frame_m : mem;
    frame_tm : mem;
    frame_val : valid frame_$\mu$ frame_m frame_tm }


that  contain: 

• frame_$\mu$: a structured injection relating two (source-target) suspended
cores;

• frame_m: the source memory at which the cores are related;

• frame_tm: the target memory at which the cores are related; and

• frame_val: a proof that frame_$\mu$ is valid for frame_m and frame_tm.

Think of frame_m and frame_tm as the source and target memories, respectively,
in which this particular pair of source-target (indirect) caller cores
(say $c_0$ and $d_0$ as in Figure 6.2) were suspended, waiting on an external
function call. frame_$\mu$ is likewise the structured injection that related the
source-target caller cores $c_0$ and $d_0$ at the point of suspension. We describe
head_inv and tail_inv in turn.


Running Cores. The invariant head_inv that holds of the topmost cores on
the stack, c and d, is defined as follows:

1  head_inv c d pf $\mu$ $\vec{p}$ m tm =
2    let idx = c.idx in
3    (c, m) $\sim^{\mu}_{idx}$ (d, tm)
4    $\wedge$ ($\forall p \in \vec{p}$. callee_caller_inv m $\mu$ p)
5    $\wedge$ ($\exists B$. roots ge B $\subseteq$ $\mathsf{vis}_S\ \mu$ $\wedge$ $\mathcal{R}_{idx}$ c m B)
6    $\wedge$ $\mathsf{dom}_T\ \mu$ = valid_blocks tm
7    $\wedge$ $\mathcal{V}_{idx}$ d tm

The parameters of the definition are: c and d, the source and target running
cores respectively; pf, a proof that the module indices of c and d are equal;
the structured injection $\mu$ that relates c and d; the frame packages $\vec{p}$ that
relate the remaining suspended cores; and the source and target memories
m and tm.

Line 3 asserts that configuration (c, m) is related to (d, tm) by the simulation
relation associated with module idx, indexed by structured injection
$\mu$. In line 5, we state that there exists a block set B such that (1) the roots
of B and global environment ge are a subset of the visible source blocks
of $\mu$ (this is Condition 6.3 above); and (2) B satisfies the invariant $\mathcal{R}_{idx}$
maintained by the reach-closed semantics of source module idx. Line 6 is
a technical condition on the target blocks of $\mu$ (asserting that $\mathsf{dom}_T\ \mu$ is
always the set of valid blocks in tm). Line 7 maintains the validity invariant
$\mathcal{V}_{idx}$ associated with the semantics of target module idx.

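Predicate callee_caller_inv on line 4 delineates the relation between the
structured injection $\mu$ and the frame packages $p \in \vec{p}$ that relate the suspended
cores on the stack, at memory m. It is defined as follows:

1  callee_caller_inv m $\mu$ p =
2    let $\mu_0$ = p.frame_$\mu$ in
3    let $m_0$ = p.frame_m in
4    let $tm_0$ = p.frame_tm in
5    $\mu_0 \sqsubseteq \mu$
6    $\wedge$ separated $\mu_0$ $\mu$ $m_0$ $tm_0$
7    $\wedge$ $\mathsf{owned}_S\ \mu_0 \cap \mathsf{owned}_S\ \mu = \emptyset$
8    $\wedge$ $\mathsf{owned}_T\ \mu_0 \cap \mathsf{owned}_T\ \mu = \emptyset$
9    $\wedge$ $\mathsf{REACH}\ m\ (\mathsf{vis}_S\ \mu) \cap (\mathsf{owned}_S\ \mu_0) \subseteq \mathsf{public}_S\ \mu_0$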

The assertion $\mu_0 \sqsubseteq \mu$ on line 5 ensures that $\mu$ extends $\mu_0$ with respect
to the Kripke-order $\sqsubseteq$. This order is similar to $\sqsubseteq_{us}$ and $\sqsubseteq_{them}$: $\mu$ may map
more blocks than $\mu_0$, in order to deal with allocations, but is otherwise equal
to $\mu_0$ (wherever $\mu_0$ is defined). The primary difference is that $\sqsubseteq$ does not
distinguish between owned and extern blocks in $\mu_0$ and $\mu$, instead treating $\mu_0$
and $\mu$ as if they were "unstructured" injections, in standard CompCert style
(cf. Chapter 2). Why is the unstructured $\sqsubseteq$ appropriate here? $\mu_0$ and $\mu$ relate
the running states of different source-target module pairs, for example, of
compiled modules idx and idx'. The blocks labeled owned, or allocated, by
module idx will be labeled extern (not owned) by module idx', and vice versa.

Lines 6 to 9 give the other key invariants: $\mu$ must be separated from $\mu_0$,
with respect to the existentially quantified memories $m_0$ and $tm_0$ associated
with $\mu_0$ (see Figure 5.5 for the definition of separated); the source/target
owned blocks of $\mu_0$ and $\mu$ must be disjoint (Lines 7 and 8); finally, the set of
source-language blocks declared owned by the caller injection $\mu_0$ but also
reachable in m from the visible set of the callee injection $\mu$ must also be
declared public by $\mu_0$.8 In other words, blocks leaked into $\mu$'s visible set at
previous interaction points must be declared public by the (direct or indirect)
caller that owns the blocks in question.

The intuition here is: A well-formed source-target state pair of the linked
program is one in which the simulation relation $\mu_0$ (relating cores $c_0$ and
$d_0$) makes no claim, at external calls, on the values contained in blocks
that have been leaked to the cores in callee position with respect to $c_0$, $d_0$.
(Recall that structured simulation proofs may assume that the memory is
unchanged over external calls only at private blocks, as in the External Steps
case of Figure 5.4.) The reason: leaked regions may be modified over the
external call, e.g., by the running cores c and d.


Suspended Cores. The invariant that relates suspended (caller) cores to
one another is defined:

1  tail_inv $\vec{p}$ $s_1$ $s_2$ m tm =
2    all_caller_callees ($\lambda p\ p_0$. caller_callee_inv m (frame_$\mu$ p) $p_0$) $\vec{p}$
3    $\wedge$ frame_all $\vec{p}$ m tm $s_1$ $s_2$


where all_caller_callees is given by the following pair of recursive functions:


8 We impose an analogous invariant on target blocks (not shown); see the code
repository that accompanies this thesis for details.
all T P (l : list T) =
  case l of
  | nil $\to$ True
  | a :: l' $\to$ P a $\wedge$ all P l'

all_caller_callees (T : Type) (P : T $\to$ T $\to$ Prop) (l : list T) =
  case l of
  | nil $\to$ True
  | a :: l' $\to$ all (P a) l' $\wedge$ all_caller_callees T P l'


Line 2 of the tail_inv listing above asserts that, for each $p \in \vec{p}$, the structured
injection given by package p (frame_$\mu$ p) is related by caller_callee_inv to each
caller package $p_0$ in the tail of $\vec{p}$ at the point at which p appears (i.e., in caller
position with respect to p). This invariant is required in order to re-establish
head_inv when the running source-target cores return to their callers.
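The two recursive functions translate almost directly into Coq. The following is a minimal self-contained sketch, with the type argument made implicit for convenience; the toy instance at the end (using `lt` on natural numbers in place of the per-package invariant) is an assumption chosen only to show which pairs of list elements get related.

```coq
Require Import List Lia.
Import ListNotations.

(* [all P l]: P holds of every element of l. *)
Fixpoint all {T : Type} (P : T -> Prop) (l : list T) : Prop :=
  match l with
  | nil => True
  | a :: l' => P a /\ all P l'
  end.

(* [all_caller_callees P l]: each element is P-related to every element
   that appears after it in the list (its direct and indirect callers). *)
Fixpoint all_caller_callees {T : Type} (P : T -> T -> Prop)
         (l : list T) : Prop :=
  match l with
  | nil => True
  | a :: l' => all (P a) l' /\ all_caller_callees P l'
  end.

(* Toy instance: with P := lt, the predicate relates the pairs
   (1,2), (1,3) and (2,3). *)
Example strictly_increasing : all_caller_callees lt [1; 2; 3].
Proof. simpl. repeat split; lia. Qed.
```

Note that the relation is imposed on every ordered pair, not just on adjacent elements, which matches the "direct or indirect caller" reading in the text.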

Line 3 of tail_inv asserts an invariant on each source-target pair $c_0$, $d_0$
in the source-target stacks $s_1$ and $s_2$, as defined by the recursive predicate
frame_all:

frame_all $\vec{p}$ m tm $s_1$ $s_2$ =
  case $\vec{p}$, $s_1$, $s_2$ of
  | mk_frame_pkg $\mu_0$ $m_0$ $tm_0$ _ :: $\vec{p}'$, $c_0$ :: $s_1'$, $d_0$ :: $s_2'$ $\to$
      $\exists$ (pf : $c_0$.idx = $d_0$.idx).
      $\exists$ $e_1$ $\vec{v}_1$.
      $\exists$ $e_2$ $\vec{v}_2$.
        frame_inv $c_0$ $d_0$ pf $\mu_0$ $m_0$ m $e_1$ $\vec{v}_1$ $tm_0$ tm $e_2$ $\vec{v}_2$
        $\wedge$ frame_all $\vec{p}'$ m tm $s_1'$ $s_2'$
  | nil, nil, nil $\to$ True
  | _, _, _ $\to$ False


In addition to asserting that $\vec{p}$, $s_1$, and $s_2$ are all the same length, frame_all
applies a subsidiary invariant, frame_inv, to each pair of cores $c_0$, $d_0$ in $s_1$
and $s_2$. This "per-frame" invariant is defined as follows:


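 1  frame_inv $c_0$ $d_0$ pf $\mu_0$ $m_0$ m $e_1$ $\vec{v}_1$ $tm_0$ tm $e_2$ $\vec{v}_2$ =
 2    let $\nu_0$ = leak_out $\mu_0$ $\vec{v}_1$ $\vec{v}_2$ in
 3    let $idx_0$ = $c_0$.idx in
 4
 5    (* per-frame invariants, on $c_0$, $d_0$, $\mu_0$, $m_0$, and $tm_0$ *)
 6    inject (as_inj $\mu_0$) $m_0$ $tm_0$
 7    $\wedge$ valid $\mu_0$ $m_0$ $tm_0$
 8    $\wedge$ ($c_0$, $m_0$) $\sim^{\mu_0}_{idx_0}$ ($d_0$, $tm_0$)
 9    $\wedge$ at_external $c_0$ = Some ($e_1$, $\vec{v}_1$)
10    $\wedge$ at_external $d_0$ = Some ($e_2$, $\vec{v}_2$)
11    $\wedge$ inject (as_inj $\mu_0 \restriction \mathsf{vis}\ \nu_0$) $\vec{v}_1$ $\vec{v}_2$
12
13    (* source visibility *)
14    $\wedge$ ($\exists B$. roots ge B $\subseteq$ $\mathsf{vis}_S\ \mu_0$ $\wedge$ $\mathcal{R}_{idx_0}$ $c_0$ $m_0$ B)
15
16    (* target validity *)
17    $\wedge$ $\mathsf{dom}_T\ \mu_0$ = valid_blocks $tm_0$
18    $\wedge$ $\mathcal{V}_{idx_0}$ $d_0$ $tm_0$
19
20    (* invariants relating $m_0$, $tm_0$ to m, tm *)
21    $\wedge$ forward $m_0$ m
22    $\wedge$ forward $tm_0$ tm
23    $\wedge$ unchanged_on $\{(b, z) \mid \mathsf{own}_S\ \nu_0\ b = \mathsf{Priv}\}$ $m_0$ m
24    $\wedge$ unchanged_on (local_out_of_reach $\nu_0$ $m_0$) $tm_0$ tm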

Lines 2 and 3 establish local definitions:

• $\nu_0$ is the injection that results by "leaking out" into $\mu_0$ all blocks
exposed by $c_0$ and $d_0$ to their callers (recall that each pair of cores $c_0$,
$d_0$ is suspended at_external on an external function call);

• $idx_0$ is the index of the module from which $c_0$ was spawned (which
happens to be equal to the index of $d_0$, by the existentially quantified
pf on line 4 of frame_all).

The other invariants form four natural groups:

Lines 6 to 11 specify "per-frame" invariants on the states $c_0$ and $d_0$, the
memories $m_0$ and $tm_0$, and the structured injection $\mu_0$ that relates
them. Configurations ($c_0$, $m_0$) and ($d_0$, $tm_0$) should be related by $\sim_{idx_0}$,
the simulation invariant of module $idx_0$. In addition, $\mu_0$ injects $m_0$ to
$tm_0$ and $\vec{v}_1$ to $\vec{v}_2$, for $\vec{v}_1$ the arguments of $c_0$'s call to external function
$e_1$ (at_external $c_0$ = Some ($e_1$, $\vec{v}_1$)) and $\vec{v}_2$ the arguments of $d_0$'s call to
external function $e_2$ (at_external $d_0$ = Some ($e_2$, $\vec{v}_2$)).

Line 14 maintains the invariant that relates the visible source blocks of $\mu_0$
to the reach-closed invariant $\mathcal{R}_{idx_0}$ (of source module $idx_0$ associated
with $c_0$).

Lines 17 and 18 parallel lines 6 and 7 of the definition of head_inv.

Lines 21 to 24 relate memories $m_0$ and $tm_0$ to the "active" memories m
and tm. For example, m and tm should be forward from $m_0$ and $tm_0$
respectively. But also, $m_0$ and m must be equal in blocks marked
private by $\nu_0$, and likewise for $tm_0$ and tm at "out of reach" locations.
These last two conditions directly match the unchanged_on conditions
of the External Steps diagram of structured simulations (Figure 5.4).


Chapter 


7  - 

Modular  Verification 


Verifiable  C  [ADH+14,  Chapter  24]  is  a  separation  logic  for  CompCert's 
Clight  language  that  supports  higher-order  features  such  as  stored  function 
pointer  specifications.  This  chapter  connects  the  Verifiable  C  logic  to  the 
compiler  correctness  results  I  presented  in  Chapter  6.  In  particular,  I  show 
how  modular  separation  logic  proofs  in  Verifiable  C  can  be  connected  to 
the linking semantics of Chapter 4 and, in this way, composed with compiler
correctness. At a high level, the result is: independent program-logic proofs
of the Hoare triples

$\Gamma_1 \vdash_{func} f_1, f_2 : \Gamma$

and

$\Gamma_2 \vdash_{func} f_3, f_4, f_5 : \Gamma$

in which $f_1, f_2, \ldots$ are function bodies, $\Gamma$ proved function specifications, and
$\Gamma_1, \Gamma_2$ assumed, imply partial correctness of the linked target program

$\mathcal{L}(C, [\![\mathsf{CompCert}(f_1, f_2)]\!]_{\mathsf{AsmSem}}, [\![\mathsf{CompCert}(f_3, f_4, f_5)]\!]_{\mathsf{AsmSem}})$

that results from independently compiling $f_1, f_2$ and $f_3, f_4, f_5$. The linked C
here is a program context compatible with the Hoare-style specifications of
the external functions called by (but implemented by none of) the compiled
modules, as encapsulated in the Hoare-logic function specifications $\Gamma_1 \cup \Gamma_2$.
The  remainder  of  this  chapter  explains  the  details. 
7.1 Modular C Program Logic

In some ways, Verifiable C is just a conventional Hoare logic. Judgments
have the following familiar form:

$\Delta \vdash \{P\}\ c\ \{R\}$

in which $\Delta$ is a type context, P is the precondition of C statement c, and R
is the postcondition.

At the same time, the Verifiable C logic is also quite complex, owing to
the complexity of the C programming language. For example, $\Delta$ does not
just give the types of variables that may appear free in c (i.e., temporaries).
It also types

•  function  parameters, 

•  addressed  local  variables,  and 

•  global  variables 

and assigns pre- and postconditions to the functions defined/called by the
program.  The  postcondition  R  is  really  a  series  of  postconditions.  Since 
C  basic  blocks  may  exit  in  multiple  ways  (by  continue,  break,  return, 
and  by  falling  through  a  switch),  R  records  multiple  postconditions,  one 
for  each  possible  return  case. 

7.1.1  Inference  Rules 

Chapter  24  of  [ADH+14]  describes  the  inference  rules  of  the  Verifiable  C 
logic  in  detail.  For  the  most  part,  the  rules  are  conventional  (if  complicated 
by  the  vagaries  of  C).  For  example,  here  is  the  rule  for  load  from  memory: 

$$\frac{\mathsf{readable}\ \pi}{\Delta \vdash \{\triangleright(e \overset{\pi}{\mapsto} v * P)\}\ x := [e]\ \{\exists v_{old}.\ x = v \wedge (e \overset{\pi}{\mapsto} v * P)[v_{old}/x]\}}\ (\text{SemaxLoad})$$

The command x := [e] assigns to x the value in memory at location e, equal to
v as specified by precondition $e \overset{\pi}{\mapsto} v$. $e \overset{\pi}{\mapsto} v$ is an instance of the maps-to
predicate of separation logic, asserting a (singleton) heap containing value
v at the location to which expression e evaluates. $\pi$ is a share, the program-logic
counterpart to the CompCert permissions of Chapter 2. Chapter 41
of [ADH+14] describes how shares are constructed in Verifiable C; Chapter
42, of which I am a co-author, describes how shares are erased to CompCert
permissions. In the rule above, we just need that $\pi$ is a readable share (giving
at least read permission; $\pi$ might permit writes as well).
The precondition $\triangleright(e \overset{\pi}{\mapsto} v * P)$ is prefixed with the later operator $\triangleright$.
This $\triangleright$ highlights another aspect of the Verifiable C logic: in order to reason
about complicated patterns of (mutual) recursion over, e.g., C function pointers,
the interpretation of the logic is step-indexed [AM01, AMRV07, AAV02,
HDA10]. In the model of the logic, predicates (and states) are paired with
natural numbers k, the number of steps for which the predicate will continue
to make claims on the system. $\triangleright P$ says that P holds not at the current
k (e.g., of the state in which the command is executed), but only at k − 1. In
the load rule, we need only that $\triangleright(e \overset{\pi}{\mapsto} v * P)$, as opposed to the stronger
$e \overset{\pi}{\mapsto} v * P$, because the assignment x := [e] itself takes a step.
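The reading of the later operator as "holds one step from now" can be illustrated with a deliberately simplified Coq model. The sketch below assumes that a step-indexed predicate is just a predicate on the index k; the actual model of Verifiable C is considerably richer, and the names `ipred`, `later`, `downward_closed`, and `now_later` are hypothetical.

```coq
Require Import Lia.

(* Simplified model: a step-indexed predicate is a predicate on the index k. *)
Definition ipred := nat -> Prop.

(* [later P] holds at k when P holds at k - 1; it holds trivially at k = 0. *)
Definition later (P : ipred) : ipred :=
  fun k => match k with
           | O => True
           | S k' => P k'
           end.

(* Predicates that keep holding as the step budget shrinks. *)
Definition downward_closed (P : ipred) : Prop :=
  forall k k', k' <= k -> P k -> P k'.

(* For such predicates, [later P] is weaker than P. *)
Lemma now_later : forall P : ipred,
  downward_closed P -> forall k, P k -> later P k.
Proof.
  intros P Hdc k Hk.
  destruct k as [| k']; unfold later.
  - exact I.
  - apply (Hdc (S k') k'); [lia | exact Hk].
Qed.
```

In this model the lemma captures why requiring only the later form of the precondition is a weaker demand than requiring the precondition itself.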

Assertion $R = \exists v_{old}.\ x = v \wedge (e \overset{\pi}{\mapsto} v * P)[v_{old}/x]$ is the strongest
postcondition of the load command. It states that, after the assignment,

• variable x equals v, and

• there exists an old value of x, call it $v_{old}$, such that the precondition
$(e \overset{\pi}{\mapsto} v * P)$ holds with $v_{old}$ substituted for x.

This particular load rule is the most general. In the logic, other specialized
forms are derived, for loading from arrays, structures, etc., and for the
special case in which x does not appear free in the precondition.
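As an illustration of the last special case, suppose x occurs free in none of e, v, and P. The substitution $[v_{old}/x]$ is then the identity and the existential quantifier is vacuous, so the postcondition collapses. The rule below is a sketch of the shape such a derived rule could take, written in the reconstructed notation of SemaxLoad; the side condition $x \notin \mathrm{fv}(e, v, P)$ is my own phrasing, and the derived rule actually provided by the logic may be stated differently.

```latex
\[
\frac{\mathsf{readable}\ \pi \qquad x \notin \mathrm{fv}(e, v, P)}
     {\Delta \vdash \{\triangleright(e \overset{\pi}{\mapsto} v * P)\}\
      x := [e]\
      \{x = v \wedge (e \overset{\pi}{\mapsto} v * P)\}}
\]
```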

7.1.2  Proving  Whole  Modules 

In addition to the basic Hoare triple, the Verifiable C logic provides a
judgment form, semax_func, for composing function body proofs in order to
prove whole modules. This judgment has form:

$V, \Gamma \vdash_{func} \vec{f} : \Gamma'$

in which

varspecs V is an environment mapping (global) variables to their types.

funspecs $\Gamma$, $\Gamma'$ are lists of function name, function specification (funspec) pairs.
$\Gamma$ are the function specifications that one may assume (but only later)
when proving functions $\vec{f}$. $\Gamma'$ are the specifications that are proved. A
single funspec is defined by the inductive:


Inductive funspec : Type =
| mk_funspec : funsig $\to$
    $\forall$ A : Type.
    $\forall$ (P : A $\to$ environ $\to$ mpred)
      (Q : A $\to$ environ $\to$ mpred). funspec
The first constructor argument (of type funsig) is the function's (C-type)
signature. environ is a triple of the global environment, the function
temporaries environment (unaddressed locals), and the function variable
environment (addressed locals). A is a (universally quantified)
type that is used to relate values (e.g., of program variables) between
pre- and postconditions P and Q. mpred is the type of predicates on
(program logic) memory states. Section 7.2 will describe the relation
of program logic memories (called rmaps) to CompCert's memories.

fundefs $\vec{f}$ is a list of function name, function definition (fundef) pairs. Each
fundef is either an Internal function definition (a C function proved
correct in this module) or an External function declaration:

Inductive fundef : Type =
| Internal : function $\to$ fundef
| External : ident $\to$ typelist $\to$ type $\to$ fundef

Internal functions are defined as in Section 3.2.1. Here is the corresponding
Coq definition, which is for the most part self-explanatory:

f $\in$ Record function : Type =
  { fn_return : type;
    fn_params : list (ident * type);
    fn_vars : list (ident * type);
    fn_temps : list (ident * type);
    fn_body : statement }

type is the type of C types. fn_body is the actual body of the function
(a C statement). The distinction between fn_vars and fn_temps is: the
vars are addressed (and therefore stack-allocated), while temps are
nonaddressed (and therefore allocated in registers, or spilled into the
stack by the compiler).

External  functions  are  just  function  type  declarations,  as  one  would 
see,  e.g.,  in  a  C  header  file.1 

The inference rules of the semax_func judgment are:

$$\frac{}{V, \Gamma \vdash_{func} \mathsf{nil} : \mathsf{nil}}\ (\text{SemaxFuncNil})$$


1  This  statement  is  a  slight  simplification.  CompCert's  external  function  type 
also  models  special-purpose  functions  such  as  compiler  intrinsics. 
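$$\frac{\begin{array}{c}
id \in \mathsf{map}\ \mathsf{fst}\ \Gamma \qquad id \notin \mathsf{map}\ \mathsf{fst}\ fs \\
\mathsf{var\_sizes\_ok}\ (\mathsf{fn\_vars}\ f) \qquad \mathsf{precondition\_closed}\ f\ P \\
V, \Gamma \vdash_{body} f : (id, \mathsf{mk\_funspec}\ (\mathsf{fn\_funsig}\ f)\ A\ P\ Q) \qquad V, \Gamma \vdash_{func} fs : \Gamma'
\end{array}}
{V, \Gamma \vdash_{func} (id, \mathsf{Internal}\ f) :: fs : (id, \mathsf{mk\_funspec}\ (\mathsf{fn\_funsig}\ f)\ A\ P\ Q) :: \Gamma'}\ (\text{SemaxFuncInternal})$$

$$\frac{\begin{array}{c}
id \in \mathsf{map}\ \mathsf{fst}\ \Gamma \qquad id \notin \mathsf{map}\ \mathsf{fst}\ fs \qquad \mathsf{length}\ ids = \mathsf{length}\ \vec{\tau} \\
(\forall gx\ (x : A)\ (vret : \mathsf{option}\ V)\ \phi.\ Q\ x\ (\mathsf{make\_ext\_rval}\ gx\ vret)\ \phi \Longrightarrow \mathsf{welltyped}\ \tau\ vret) \\
O \vdash_{ext} (ids, f_{id}) : (A, P, Q) \qquad V, \Gamma \vdash_{func} fs : \Gamma'
\end{array}}
{V, \Gamma \vdash_{func} (id, \mathsf{External}\ \vec{\tau}\ \tau) :: fs : (id, \mathsf{mk\_funspec}\ (\mathsf{zip}\ ids\ \vec{\tau}, \tau)\ A\ P\ Q) :: \Gamma'}\ (\text{SemaxFuncExternal})$$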


SemaxFuncNil serves as the base case of a whole-module proof: the
empty list of functions satisfies the empty list of specifications.

SemaxFuncInternal is the rule for verifying an Internal definition, that is, a
function defined in the current module. The key hypothesis is the
subsidiary judgment:

$V, \Gamma \vdash_{body} f : (id, \mathsf{mk\_funspec}\ (\mathsf{fn\_funsig}\ f)\ A\ P\ Q)$

called semax_body, which states what it means for a function body to
satisfy its specification. semax_body is defined in terms of Verifiable
C's underlying Hoare judgment:

$V, \Gamma \vdash_{body} f : (id, \mathsf{mk\_funspec}\ sig\ A\ P\ Q)$ =
  $\forall x : A$. func_tycontext f V $\Gamma$ $\vdash$
    {P x * stackframe_of f} fn_body f
    {function_body_ret_assert (fn_return f) (Q x) * stackframe_of f}

func_tycontext constructs the appropriate $\Delta$ from f, V, and $\Gamma$. The
predicate stackframe_of gives the shape of the stack (in memory) for
function f, and is *-conjoined in both the pre- and postconditions.
function_body_ret_assert binds the function return value in Q and ensures
that the function returns properly (as opposed to, e.g., break).

SemaxFuncExternal is the rule for "verifying" external functions. Why
do we need such a rule? External functions are not associated with
definitions (not, at least, in the module currently being verified). It
stands to reason that we should be able to assume their specifications,
e.g., as axioms, at least for purposes of the current module.
The point is not to verify the functions themselves (that will be done
when we verify the remaining program modules), but instead to ensure
that the function specifications P, Q assumed in Γ for this module
match those given by an external oracle. This matching is accomplished
by the judgment

    O ⊢_ext (ids, f_id) : (A, P, Q)

which reads "external oracle O justifies function specification P, Q"
(A is the type of state shared between P and Q). O is only an implicit
argument to the semax_func judgment; I do not write it explicitly in the
conclusion of SemaxFuncExternal or the other inference rules.

The oracle is shared among all the modules in the program. In this
way, it ensures that each module verification agrees on shared function
specifications. At the same time, the external specification oracle
is language- and (mostly) program-logic-independent, meaning it can
be reused even in the proofs that connect Verifiable C to linking
semantics and Compositional CompCert.

What is the shape of the oracle? Its type is:

    O ∈ Record external_specification (M E Ω : Type) : Type :=
      { ext_spec_type : E → Type;
        ext_spec_pre  : ∀ ef : E.
                          ext_spec_type ef → genviron →
                          list typ → list V → Ω → M → Prop;
        ext_spec_post : ∀ ef : E.
                          ext_spec_type ef → genviron →
                          option typ → option V → Ω → M → Prop;
        ext_spec_exit : option V → Ω → M → Prop }.

The parameters are: M, the type of memory over which the external
specification oracle quantifies; E, the type of external function
names/declarations; Ω, the type of external oracle states. In Verifiable C,
E is typically specialized to CompCert's external_function. The M parameter
is variously CompCert's mem type or the type of juicy memories, which I
introduce in Section 7.2.

The  fields  of  the  record  are: 

ext_spec_type: A function from external function names to the types of
auxiliary state shared between their pre- and postconditions. For an
external function name ef, ext_spec_type ef is the analog of A in a
Verifiable C funspec (A, P, Q).

ext_spec_pre: A (dependent) map from external function names to
preconditions. The parameter of type ext_spec_type ef is dependent on the
particular ef that is passed. In the interpretation of external
specifications, this second parameter is shared between the pre- and
postcondition. genviron is a map from global identifiers to the blocks at
which they are allocated in memory. list typ are the expected
(language-independent) types of the function arguments.2

ext_spec_post: A map from external function names to postconditions,
analogous to ext_spec_pre. The argument of type option V is the (optional)
return value of function ef.

ext_spec_exit: A predicate that must hold at module exit (i.e., at return from
module entry points).
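As an informal illustration, and not part of the Coq development, the shape of the oracle can be mimicked in a few lines of Python. Everything named here (ExternalSpec, the read_int function, the list-valued oracle state) is a hypothetical example, and Coq's dependent typing of the auxiliary state x is replaced by ordinary dynamically typed values.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class ExternalSpec:
    """Toy, untyped analogue of the external_specification record."""
    spec_type: Callable[[str], type]   # ext_spec_type
    pre: Callable[..., bool]           # ext_spec_pre
    post: Callable[..., bool]          # ext_spec_post
    at_exit: Callable[..., bool]       # ext_spec_exit


# Hypothetical oracle for a single external function "read_int".
# The oracle state omega is the list of integers still waiting on an input stream.
def pre(ef: str, x: Any, genv: dict, arg_types: list, args: list,
        omega: list, mem: Any) -> bool:
    # x is the auxiliary state: it pins down the value the call must return.
    return ef == "read_int" and args == [] and len(omega) > 0 and x == omega[0]


def post(ef: str, x: Any, genv: dict, ret_type: Optional[str],
         vret: Optional[int], omega: list, mem: Any) -> bool:
    # The same x is reused to constrain the return value.
    return ef == "read_int" and vret == x


oracle = ExternalSpec(
    spec_type=lambda ef: int,
    pre=pre,
    post=post,
    at_exit=lambda vret, omega, mem: True,
)
```

In this sketch the auxiliary state x is fixed at the call (the precondition ties it to the next input) and reused by the postcondition, which is the sharing role that ext_spec_type plays in the record.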


With external specification oracles, one can extend the definition of safety
given for closed programs (Section 5.2.2) to open programs, those that may
call external functions. Like the definition of Section 5.2.2, open safeN is
still indexed by a natural number n, which gives the number of steps for which
we will interrogate the system. Unlike that previous definition, which was a
pure safety condition, open safeN also imposes a postcondition (ext_spec_exit)
at module halt, making open safeN a partial correctness property. The other
major difference is the addition of the SafeN-External case for external
function calls:

    ----------------- (SafeN-Zero)
    safeN 0 ω c m

    ge ⊢ c, m ⟼ c', m'      safeN n ω c' m'
    ----------------------------------------- (SafeN-Step)
    safeN (n + 1) ω c m

    at_external c = Some (ef, v)
    ext_spec_pre O ef x (genv_symb ge) (sig_args ef) v ω m
    (∀ vret m' ω' n'. n' < n ∧ R(n', m, m')
        ∧ ext_spec_post O ef x (genv_symb ge) (sig_res ef) vret ω' m'
        ⟹ ∃c'. after_external vret c = Some c' ∧ safeN n' ω' c' m')
    ------------------------------------------------------------------- (SafeN-External)
    safeN (n + 1) ω c m

    halted c = Some vret      ext_spec_exit O (Some vret) ω m
    --------------------------------------------------------- (SafeN-Halted)
    safeN n ω c m

The generalized safeN for open programs is a predicate of type ℕ → Ω →
C → M → Prop. The n : ℕ is the number of steps for which configuration
(c, m) is safe. ω is the external state of the oracle, or outside world.

2 We do not use C types here because external specifications are intended to be
language-independent. typs, as opposed to C types, include: int, float, long, single.
The most recent versions of CompCert also include any32 and any64, for typing
unknown data of fixed width.

Most of the rules are self-explanatory, by reference to the definition
of closed safeN in Section 5.2.2. New is the rule for external function calls,
SafeN-External. It handles the case in which a core state c is at_external,
calling external function ef with arguments v. In this situation, we say
(c, m) is safe for n + 1 steps when:

• v, ω, and m satisfy ef's precondition (ext_spec_pre O ef x ...); and

• for all return values vret, new memory states m', new external states
  ω', and naturals n' such that

  - n' < n,

  - n' and m' are related to m by a particular relation R (which I will
    explain in a moment), and

  - vret, ω', and m' satisfy ef's postcondition,

  running after_external c to inject the return value vret results in a new
  state c' that is safe for n' steps in ω' and m'.

The intuition for SafeN-External is: A configuration (c, m), calling
external function ef, is safe for n + 1 steps when it is safe for n' < n steps
in any state the external world may return after executing ef, as long as
the returned state satisfies the postcondition agreed upon in O. The state
exposed by (c, m) to the outside world, at the point of the call (the memory
m and the function arguments v), must satisfy ef's precondition as well.
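The four rules can be read as a recursive checker. The following Python sketch is a bounded, executable approximation under stated assumptions: the auxiliary state x is dropped, the universally quantified return triples are drawn from a finite, caller-supplied enumeration called returns, the relation R is passed in as a parameter, and the toy module is hypothetical. It follows the SafeN-External rule as printed above, quantifying over n' < n.

```python
from typing import Any, Callable, NamedTuple


class Core(NamedTuple):
    """The operations of an interaction semantics (simplified, untyped)."""
    step: Callable[[Any, Any], Any]             # (c, m) -> (c', m') or None
    at_external: Callable[[Any], Any]           # c -> (ef, args) or None
    after_external: Callable[[Any, Any], Any]   # (vret, c) -> c' or None
    halted: Callable[[Any], Any]                # c -> return value or None


def safe_n(sem, pre, post, exit_ok, returns, R, n, omega, c, m):
    if n == 0:                                   # SafeN-Zero
        return True
    vret = sem.halted(c)
    if vret is not None:                         # SafeN-Halted
        return exit_ok(vret, omega, m)
    call = sem.at_external(c)
    if call is not None:                         # SafeN-External
        ef, args = call
        if not pre(ef, args, omega, m):
            return False
        for v, omega2, m2 in returns(ef, args, omega, m):
            if not post(ef, v, omega2, m2):
                continue                         # the postcondition is a hypothesis
            for n2 in range(n - 1):              # every n' < n, where n here is the rule's n + 1
                if not R(n2, m, m2):
                    continue
                c2 = sem.after_external(v, c)
                if c2 is None or not safe_n(sem, pre, post, exit_ok,
                                            returns, R, n2, omega2, c2, m2):
                    return False
        return True
    nxt = sem.step(c, m)                         # SafeN-Step
    if nxt is None:
        return False                             # stuck: no rule applies
    c2, m2 = nxt
    return safe_n(sem, pre, post, exit_ok, returns, R, n - 1, omega, c2, m2)


# Hypothetical toy module: call external "get", then halt with the result.
toy = Core(
    step=lambda c, m: (("done", c[1]), m) if c[0] == "got" else None,
    at_external=lambda c: ("get", []) if c[0] == "start" else None,
    after_external=lambda v, c: ("got", v) if c[0] == "start" else None,
    halted=lambda c: c[1] if c[0] == "done" else None,
)

pre = lambda ef, args, omega, m: ef == "get"
post = lambda ef, v, omega, m: v >= 0            # "get" promises a nonnegative result
returns = lambda ef, args, omega, m: [(v, omega, m) for v in (-1, 0, 5)]
dry_R = lambda n2, m, m2: True                   # the "dry" choice of R
start = ("start",)
```

With exit condition v >= 0 the toy module is safe for any bound, because every return value allowed by the postcondition leads to an acceptable halt. Tightening the exit condition to v > 0 makes it unsafe once the bound is large enough to reach the halt, since the external world may legitimately return 0.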

The relation R used above differs depending on the type M at which
safeN is parameterized. When M equals CompCert's mem, R is just λ n' m m'.
True (meaning we must prove safety, over external calls, for all n' < n). I call
this definition "dry safety" (dry_safe) as opposed to "juicy safety" (juicy_safe),
for reasons that will become apparent in Section 7.2.

When M is the type of juicy memories (upcoming, in Section 7.2), R is
specialized to:

    R n' m m' =
      n' = level m' ∧ level m' < level m
      ∧ (* ...a relation on the function specifications embedded in m, m'. *)

Safety specialized to this R is called "juicy safety." The natural n' must equal
the step index of juicy memory m'; also, the step index of m' must be strictly
less than that of m. These two conditions are step-index-related: if n' (the
number of steps for which we must prove safety in SafeN-External)
were greater than level m', then it would be possible for level m' = 0 (the
"we have given up" state), while n' > 0 forces us to continue to prove safety
for some nonzero number of remaining steps n'.


7.2  Juicy  Memories 

The semax_func judgment I outlined in the previous section gives a proof
theory for C programs. How do we know this theory is sound? When we
prove V, Γ ⊢_func f⃗ : Γ, for funspecs Γ and functions f⃗, who guarantees
that the functions f⃗ actually satisfy their specifications? Or that, for a given
f ∈ f⃗, if we initialize f in a state satisfying its precondition, it will either
safely run forever3 or halt in a state satisfying its postcondition?

The answer, as Part VI of [ADH+14] demonstrates, is to construct a
semantic model of the Hoare judgment, and then prove soundness with
respect to this model. In this section, I briefly describe enough of the
underlying machinery to explain how the semantic model of the Verifiable C
logic is connected to Compositional CompCert (Section 7.3).


Juicy Memories. When reasoning in a program logic, step indexes are
unproblematic: the step indexes can often be hidden via use of the ▷ operator,
and do not often appear explicitly in assertions.

How to connect step-indexed states to CompCert's memories (Chapter 2),
which are not step-indexed? One could simply step-index CompCert.
But this strategy makes it difficult, at least naively, to prove correctness of
compiler phases that may change the number of steps. A better solution
is to stratify the models into two layers: operational states corresponding
to states of the operational semantics, and semantic worlds appearing in
assertions of the program logic. Juicy memories are the specialization of this
strategy to Verifiable C rmaps (resource maps, Verifiable C's step-indexed
model of the state) and CompCert memories.

3 E.g., with respect to the definition of safety just given.

To a first approximation, a juicy memory jm defines what it means for
an rmap φ to erase to a CompCert memory m. By erasure, we mean the
removal  of  the  "juice"  that  is  unnecessary  for  execution  (as  in  Curry-style 
type  erasure  of  simply  typed  lambda  calculus).  The  "juice"  has  several 
components:  permission  shares  controlling  access  to  objects  in  the  program 
logic;  predicates  in  the  heap  describing  invariants  of  objects  in  the  program 
logic;  and  the  classification  of  certain  addresses  as  values,  locks,  function 
pointers,  etc. 

Resource maps, or just rmaps for short, are the program-logic counterpart
to CompCert memories. Their type is abstract (hidden behind a Coq
module interface in file VST/veric/rmaps.v), but to a first approximation
think of rmaps as maps from CompCert addresses (block-offset pairs)
to resources, where resources generalize CompCert memvals (abstract bytes).

    φ ∈ rmap ≈ address → resource

I say "only to a first approximation" because rmap resources will contain
assertions (such as function specifications and lock invariants) that may
quantify over the rmap itself. Thus rmaps are not defined directly as above,
but instead using step indexing. Chapter 39 of [ADH+14] gives more detail.
A paper by Hobor, Dockins, and Appel [HDA10] explains the particular
technique used (indirection theory). For purposes of this thesis, the
step-indexed details of the Verifiable C model are not critically important.
When I wish to indicate that rmap φ contains resource res at location l, I
will use syntax φ @ l = res.

The resources themselves are defined inductively as:

    res ∈ Inductive resource : Type :=
      | NO : resource
      | YES : pshare → kind → preds → resource
      | PURE : kind → preds → resource.

A NO resource indicates no access to a location. φ @ l = YES π k pp
asserts a resource of kind k with program-logic permission π and (optional)
predicates pp at location l. pshare stands for positive (i.e., nonzero) share.

    k ∈ Inductive kind : Type :=
      | VAL : memval → kind
      | LK : Z → kind
      | CT : Z → kind
      | FUN : funsig → kind.

The kinds are either values VAL, for CompCert memvals; function
specifications FUN, which specify function pointers; or the special LK/CT
kinds, which indicate that a particular (series of) locations in memory serves
as a semaphore.4

The PURE resource is used primarily to store FUN kinds, with the
predicates pp, which may quantify over the rmap itself, storing the function
pre- and postconditions. YES is used primarily to represent actual bytes in
memory.

For example, the program-logic representation of a CompCert memval v
with Freeable permission is:

    YES ⊤ (VAL v) NoneP

where ⊤ is the topmost share in Verifiable C's permission lattice and NoneP
represents the empty list of predicates.


Juicy Memories: Implementation. In veric/juicy_mem.v, we define
juicy memories as pairs of a memory m and an rmap φ. The rmap and
memory must be consistent with each other, in a way we will make precise in
a moment. In the code, we represent this pair with the following inductive
type.

    Inductive juicy_mem : Type :=
      mkJuicyMem : ∀ (m : mem) (φ : rmap)
        (JMcontents : contents_cohere m φ)
        (JMaccess : access_cohere m φ)
        (JMmax_access : max_access_cohere m φ)
        (JMalloc : alloc_cohere m φ),
        juicy_mem.

We equip the type juicy_mem with accessor functions of the form

    m_dry (jm : juicy_mem) = case jm of mkJuicyMem m _ _ _ _ _ → m
    m_phi (jm : juicy_mem) = case jm of mkJuicyMem _ φ _ _ _ _ → φ

The four proof objects beginning JM... enforce the four consistency
requirements:


4 LK/CT are used only in the Verifiable C extension to Concurrent Separation
Logic. LK is the resource kind associated with the first byte of a 4-byte lock; the
remaining three bytes of every lock contain CT kinds.

Contents. If φ @ l = YES π (VAL v) pp then m l = v and pp = NoneP.
That is, a VAL in the rmap must have no "predicates in the heap"
associated with it, and the v in the rmap must match the v in the
CompCert memory. Predicates will only occur in PUREs, to give function
specifications, and in locks (YES π (LK k) (SomeP R)) to give
resource invariants.

Access. For all locations l, m l = perm_of_res (φ @ l). The fractional share
φ @ l must "erase" to that location's CompCert memory permission.
perm_of_res is a simple function that erases Verifiable C's fractional
shares to CompCert-style permissions. Chapter 42 of [ADH+14] provides
additional justification. In particular, it explains why erasing
to coarse-grained permissions when connecting to CompCert makes
sense for Pthreads-style concurrency.

Max Access. For all locations l,

    max_access_at m l ⊒ perm_of_sh π    when φ @ l = YES π _ _
    max_access_at m l ⊒ perm_of_sh ⊥    when φ @ l = NO
    fst l < nextblock m                 when ∃ f pp. φ @ l = PURE f pp.

Alloc. For all locations l, if fst l ≥ nextblock m then φ @ l = NO. CompCert
treats addresses whose abstract base pointer is beyond nextblock as
not-yet-allocated. Here we ensure that φ makes no claim to those
addresses.

The juicy-memory consistency requirements are mostly straightforward.
Max Access is a bit more complicated. It does case analysis on the resource
φ @ l, ensuring that the maximum permission in m at a given location is
greater than or equal to the permission corresponding to the shares π
or ⊥. As I explained in Chapter 2, maximum permissions are a technical
device used in version 2 of CompCert's memory model to express invariants
useful for optimizations like constant propagation. The current permission
in m at location l, or just permission, never exceeds the maximum
permission. When φ @ l contains a PURE resource, Max Access just ensures
that l is a location that was allocated at some point (fst l < nextblock m).
Here nextblock m is the next block in CompCert's internal free list, as in
Chapter 2.
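To make the four requirements concrete, here is a small Python sketch that checks them over finite toy structures. It is an assumption-laden model, not VST's definition: shares are fractions with 1 standing for ⊤ and 0 for ⊥, perm_of_sh is a simplified stand-in erasure (full share to Freeable, any other nonzero share to Readable), only YES resources are taken to erase to a permission, and a memory is just three dictionaries plus nextblock.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Any, Optional

# CompCert's permission order, with None standing for "no permission".
PERM_RANK = {None: 0, "Nonempty": 1, "Readable": 2, "Writable": 3, "Freeable": 4}


@dataclass(frozen=True)
class NO:
    pass


@dataclass(frozen=True)
class YES:
    share: Fraction                 # program-logic share, 0 < share <= 1
    kind: tuple                     # e.g. ("VAL", v)
    preds: Optional[Any] = None     # None plays the role of NoneP


@dataclass(frozen=True)
class PURE:
    kind: tuple                     # e.g. ("FUN", sig)
    preds: Optional[Any] = None


def perm_of_sh(share):
    """Simplified stand-in for share erasure (an assumption of this sketch)."""
    if share == 0:
        return None
    return "Freeable" if share == 1 else "Readable"


def perm_of_res(res):
    # Assumption: only YES resources erase to a CompCert permission.
    return perm_of_sh(res.share) if isinstance(res, YES) else None


@dataclass
class Mem:
    contents: dict      # address (block, offset) -> value
    cur_perm: dict      # address -> current permission
    max_perm: dict      # address -> maximum permission
    nextblock: int


def coherent(m, phi):
    """Check Contents, Access, Max Access and Alloc at every known address."""
    locs = set(m.contents) | set(m.cur_perm) | set(m.max_perm) | set(phi)
    for l in locs:
        res = phi.get(l, NO())
        if isinstance(res, YES) and res.kind[0] == "VAL":            # Contents
            if m.contents.get(l) != res.kind[1] or res.preds is not None:
                return False
        if m.cur_perm.get(l) != perm_of_res(res):                     # Access
            return False
        if isinstance(res, PURE):                                     # Max Access
            if not l[0] < m.nextblock:
                return False
        else:
            share = res.share if isinstance(res, YES) else 0
            if PERM_RANK[m.max_perm.get(l)] < PERM_RANK[perm_of_sh(share)]:
                return False
        if l[0] >= m.nextblock and not isinstance(res, NO):           # Alloc
            return False
    return True
```

Addresses that appear in neither the memory nor the rmap are NO with no permission on both sides, so checking only the finitely many known addresses is enough in this model.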

The consistency requirements together ensure that assertions expressed
in the Hoare logic on the φ portion of the juicy memory actually say
something about the CompCert memory m. For example, suppose we know
(perhaps because φ satisfies the assertion l ↦ v) that φ contains the value
v with share π at location l. Then, in order to prove that a load from m
at location l will succeed, we would also like to be able to show that m
contains v at l, with at least readable permission.

To validate that the consistency requirements described above satisfy
laws of this form, we prove such a lemma for each of the basic CompCert
memory operations: load, store, alloc, and free. For example, here is the
lemma for mapsto with writable share.

    Lemma mapsto_can_store :
      ∀ ch v b z jm v'.
        (address_mapsto ch v ⊤ (b, z) * TT) (m_phi jm) ⟹
        ∃m'. store ch (m_dry jm) b z v' = Some m'.

This lemma relies on the consistency requirements to prove that the store in
m_dry jm will succeed. The lemmas for the other memory operations differ
in the predicate on m_phi jm but are otherwise similar.

In addition to "progress" lemmas of the form mapsto_can_store, we prove
"preservation" lemmas for juicy memories. That is, we would like to know
that after each CompCert memory operation on m_dry jm, yielding a new
memory m', it is possible to construct a new juicy memory jm' such that
m_dry jm' = m'. The intuition here is that memory operations on m_dry jm
never touch the hidden parts of m_phi jm, e.g., the function specifications and
lock invariants appearing in Hoare logic assertions. Thus it is possible to
construct jm' generically from m' and m_phi jm, by copying hidden data
unchanged from m_phi jm to m_phi jm', and by updating m_phi jm' at those
locations that were updated by the memory operation.

For example, the function after_alloc' defines the map underlying the
new m_phi jm' after an allocation alloc (m_dry jm) lo hi.

    after_alloc' (lo hi : Z) (b : block) (φ : rmap) (H : ∀z. φ @ (b, z) = NO)
      : address → resource = fun l →
        if adr_range_dec (b, lo) (hi − lo) l
        then YES ⊤ pfullshare (VAL Undef) NoneP
        else φ @ l.

    Semantics G C juicy_mem
      initial_core   = initial_core sem
      at_external    = at_external sem
      after_external = after_external sem
      halted         = halted sem
      corestep ge (c : C) (jm : juicy_mem) (c' : C) (jm' : juicy_mem) : Prop =
        corestep sem ge c (m_dry jm) c' (m_dry jm')
        ∧ resource_decay (nextblock (m_dry jm)) (m_phi jm) (m_phi jm')
        ∧ level jm = level jm' + 1.

Figure 7.1: Juicy interaction semantics, parameterized by an underlying
semantics sem : Semantics G C mem.

Then the lemma

    Lemma juicy_mem_alloc_at :
      ∀ jm lo hi jm' b. juicy_mem_alloc jm lo hi = (jm', b) ⟹
        ∀ l. m_phi jm' @ l = if adr_range_dec (b, lo) (hi − lo) l
                              then YES ⊤ pfullshare (VAL Undef) NoneP
                              else m_phi jm @ l.

gives an extensional definition of the contents of the juicy memory jm'
that results. Here juicy_mem_alloc uses after_alloc' to construct the new juicy
memory jm' resulting from the allocation.
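The allocation update can be mimicked executably. This is a minimal Python sketch under simplifying assumptions: an rmap is a dictionary from (block, offset) pairs to resource tuples, absent keys mean NO, and the two share arguments of YES are collapsed into a single "top" tag. The freshness hypothesis H becomes a runtime assertion.

```python
NO = ("NO",)


def after_alloc(lo, hi, b, phi):
    """Resource lookup after allocating offsets lo <= ofs < hi in fresh block b."""
    # Hypothesis H: phi holds NO at every offset of block b.
    assert all(res == NO for (blk, _ofs), res in phi.items() if blk == b), \
        "block b must be fresh in phi"

    def lookup(l):
        blk, ofs = l
        if blk == b and lo <= ofs < hi:          # adr_range_dec (b, lo) (hi - lo) l
            return ("YES", "top", ("VAL", "Undef"), None)
        return phi.get(l, NO)                    # elsewhere: unchanged

    return lookup


# Example: one existing byte in block 1, then allocate offsets 0..3 of block 2.
phi = {(1, 0): ("YES", "top", ("VAL", 7), None)}
phi2 = after_alloc(0, 4, 2, phi)
```

Addresses inside the new range hold an undefined value with full permission, the upper bound hi is exclusive, and every other address keeps the resource it had before, which is the content of juicy_mem_alloc_at in this simplified setting.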


Juicy Semantics. It is possible to take the Clight semantics I presented in
Chapter 3, which operated on CompCert memories, and lift it to operate
on juicy memories instead. In fact, this process can be replicated for any
interaction semantics operating on CompCert memories (Figure 7.1). Here
is the generic construction (file VST/veric/juicy_extspec.v):

Assume as input an interaction semantics sem : Semantics G C mem
operating on CompCert memories. We will construct a new interaction
semantics, 𝒥(sem), by defining the new juicy step relation jstep:

    jstep G C (sem : CoreSemantics G C mem) (ge : G)
          (c : C) (jm : juicy_mem) (c' : C) (jm' : juicy_mem) : Prop
      = corestep sem ge c (m_dry jm) c' (m_dry jm')
        ∧ resource_decay (nextblock (m_dry jm)) (m_phi jm) (m_phi jm')
        ∧ level jm = level jm' + 1.

The new jstep relation embeds the corestep relation of the underlying
semantics, projected to the m_dry components of the initial and final juicy
memories jm and jm'. In addition, it asserts that

• m_phi jm' is resource_decayed from m_phi jm; and

• the level, or age, of jm' is one less than the age of jm (stepping reduces
  the step index by one).

The resource_decay relation is a bit more complicated:

    resource_decay (nextb : block) (φ1 φ2 : rmap) =
      level φ1 ≥ level φ2 ∧
      ∀ l : address. (fst l ≥ nextb ⟹ φ1 @ l = NO) ∧
        ( (* either no change, up to step indices *)
          resource_fmap (approx (level φ2)) (φ1 @ l) = (φ2 @ l)
          (* or write at location l *)
        ∨ (∃ v v'. resource_fmap (approx (level φ2)) (φ1 @ l)
                     = YES ⊤ (VAL v) NoneP
                   ∧ φ2 @ l = YES ⊤ (VAL v') NoneP)
          (* or l newly allocated in φ2 *)
        ∨ (fst l ≥ nextb ∧ ∃ v. φ2 @ l = YES ⊤ (VAL v) NoneP)
          (* or l freed in φ2 *)
        ∨ (∃ v pp. φ1 @ l = YES ⊤ (VAL v) pp ∧ φ2 @ l = NO) ).

The relation gives an extensional interpretation of the kinds of memory
effects that may occur over steps from rmap φ1 = m_phi jm to φ2 = m_phi jm'.
At every location l, either φ1 @ l = φ2 @ l (up to step-indexing, resource_fmap,
etc.), or there was a write at l, or l was newly allocated in φ2, or l was freed.
resource_decay is often used, in the construction of the Verifiable C model,
to reason by cases on the Clight jstep relation. Its justification is the fact
that all operational semantics that respect the CompCert memory model's
interface behave in this way.
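The case analysis can be turned into a small checker. The following Python sketch is a simplification made for illustration: an rmap is a level paired with a finite dictionary (absent keys mean NO), and the approximation step resource_fmap (approx ...) is ignored, so "no change" is plain equality. The names RMap, val, and is_top_val are hypothetical helpers.

```python
from typing import NamedTuple

NO = ("NO",)


class RMap(NamedTuple):
    level: int
    at: dict          # (block, offset) -> resource tuple; absent means NO


def val(v, preds=None):
    """A full-share VAL resource, written YES ⊤ (VAL v) preds in the text."""
    return ("YES", "top", ("VAL", v), preds)


def is_top_val(res):
    return res[0] == "YES" and res[1] == "top" and res[2][0] == "VAL"


def resource_decay(nextb, phi1, phi2):
    if phi1.level < phi2.level:                  # level φ1 ≥ level φ2
        return False
    for l in set(phi1.at) | set(phi2.at):
        r1, r2 = phi1.at.get(l, NO), phi2.at.get(l, NO)
        if l[0] >= nextb and r1 != NO:           # blocks past nextb were empty
            return False
        unchanged = r1 == r2
        written = (is_top_val(r1) and r1[3] is None
                   and is_top_val(r2) and r2[3] is None)
        allocated = l[0] >= nextb and is_top_val(r2) and r2[3] is None
        freed = is_top_val(r1) and r2 == NO
        if not (unchanged or written or allocated or freed):
            return False
    return True


before = RMap(5, {(1, 0): val(1), (1, 1): val(2)})
```

Locations outside both dictionaries are NO before and after, so they fall under the "no change" case and need not be enumerated.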


Whole-Module Correctness. Now that I've introduced open-program safety,
juicy memories, and the 𝒥 operator for lifting standard CompCert interaction
semantics such as the one for Clight to their juicy counterparts, I can
finally state the correctness theorem for single-module proofs in the logic.

It is:

Theorem 8 (Soundness of Judgment V, G ⊢_func mod : G).

     1  ∀ mod V G ω jm f f_id f_b f_body v.
     2    let ge = globalenv mod in
     3    let O = funspecs_of G in
     4    let τ⃗ = sig_args f in
     5    let τ = sig_res f in
     6    let sem = 𝒥(CLsem) in
     7    V, G ⊢_func mod : G ⟹
     8    fun_id f = Some f_id ⟹
     9    find_symbol ge f_id = Some f_b ⟹
    10    find_funct ge (Vptr f_b Int.zero) = Some f_body ⟹
    11    ∀x : ext_spec_type O f. ext_spec_pre O f x (genv_symb ge) τ⃗ v ω jm
    12      ⟹ ∃c. initial_core sem ge (Vptr f_b Int.zero) v = Some c
    13          ∧ juicy_safe sem (O with {ext_spec_exit = ext_spec_post O f})
    14                       ge ω c jm


Proof.  The  theorem  is  a  corollary  of  the  interpretation  of  semax_func.5  The 
machine-checked  proof  is  still  in  progress.  □ 


Essentially, for each function f ∈ mod, if we initialize mod at entry point f
in a state satisfying f's precondition, then the module will either (safely)
loop forever, under the definition of safety for open programs given in this
chapter, or terminate in a state satisfying f's postcondition.

In detail, given

• a program module mod,

• variable and global function specifications V and G,

• external state ω,

• juicy memory jm,

• function f, and

• arguments v

if f is defined by module mod (lines 8 to 10), and v, ω, and jm satisfy f's
precondition as given by the function specification O (which I will explain
in a moment), then initializing the module at entry point f results in an
initial core state c that is (juicy) safe in the specification that updates O to
have ext_spec_exit predicate equal to f's postcondition.

5 File VST/veric/semax_prog.v.

How is O defined? Take the empty set of specifications and add to
it the function specifications given in G. For example, if G associates
funspec (A, P, Q) with f_id, then ext_spec_pre (funspecs_of G) will map f to
P, ext_spec_post (funspecs_of G) will map f to Q, and so on, for the other
function specifiers in G. 6
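The construction of O from G can be sketched as a table lookup. This Python fragment is only an illustration under assumptions: G is a list of (name, (A, P, Q)) pairs, predicates are ordinary Boolean functions, and functions missing from G receive unsatisfiable pre- and postconditions, which is a modelling choice of this sketch rather than a claim about the Coq definition. The incr entry is a hypothetical example.

```python
def funspecs_of(G):
    """Build a toy oracle (a dict of three functions) from funspecs G."""
    table = dict(G)                              # f_id -> (A, P, Q)

    def spec_type(f):
        return table[f][0]                       # the A of (A, P, Q)

    def pre(f, x, *state):
        return f in table and table[f][1](x, *state)

    def post(f, x, *state):
        return f in table and table[f][2](x, *state)

    return {"ext_spec_type": spec_type,
            "ext_spec_pre": pre,
            "ext_spec_post": post}


# Hypothetical funspec: incr takes an argument equal to x and returns x + 1.
G = [("incr", (int,
               lambda x, arg: arg == x,
               lambda x, ret: ret == x + 1))]
O = funspecs_of(G)
```

Each entry of G contributes its P to ext_spec_pre and its Q to ext_spec_post, with the auxiliary value x threaded through both.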


7.3  Composing  End-to-End 

How  do  we  compose  Theorem  8  with  the  results  in  Chapter  6,  on  verified 
separate  compilation? 

Figure 7.2 depicts the general strategy. At the top of the diagram, the
user has proved a number of program-logic semax_func judgments of the
form

    V, G ⊢_func mod_idx : G

for each module index idx, with respect to the semantics 𝒥(CLsem). As I
will prove in this section, these independent proofs, with respect to the
shared function specifiers G, yield (open-program) safety with respect to
the linked semantics

    ℒ([mod_0]_𝒥(CLsem), [mod_1]_𝒥(CLsem), ..., [mod_{N−1}]_𝒥(CLsem))

ℒ is the linking operator of Chapter 4, made parametric over the type of
memories M shared by the semantics, here juicy memories.7

The linked program ℒ(...) may still call external functions: those specified
by the oracle O but not defined by any of the modules 0 to N − 1. The
next step is to close over these external functions, by constructing a Gallina
context 𝒥(C) that "executes" the external functions, in the manner of
Section 4.3. We must also ensure, for the next step, that 𝒥(C) is
erasable, by which I mean the predicates "checked" by C are erasable to
the m_dry component of juicy memories.8 This process gives us safety of

    ℒ(𝒥(C), [mod_0]_𝒥(CLsem), [mod_1]_𝒥(CLsem), ..., [mod_{N−1}]_𝒥(CLsem))

Finally, we erase the juicy linked semantics, resulting in a proof of safety
for semantics

    ℒ(C, [mod_0]_CLsem, [mod_1]_CLsem, ..., [mod_{N−1}]_CLsem)

in which C is the erasure of context 𝒥(C), and each module mod_0 ... mod_{N−1}
is now interpreted in the "dry" Clight semantics CLsem against which
Compositional CompCert is proved correct.

As long as C is a well-defined context (Definition 8, Chapter 6), the
results of Chapter 6 give us that C is safe when linked with the
(independently) compiled assembly modules CompCert(mod_0), ...,
CompCert(mod_{N−1}), assuming compilation succeeds for each of mod_0
through mod_{N−1}.

6 See VST/veric/semax_ext.v for the details. In that file, funspecs_of is called
add_funspecs.

7 Linking semantics as defined in Chapter 4 was specialized to CompCert
memories, for concreteness. However, CompCert memories were by no
means essential. The definition of parametric linking semantics is given in
file VST/linking/linking.v.

    V, G ⊢_func mod_0 : G,  V, G ⊢_func mod_1 : G,  ...,  V, G ⊢_func mod_{N−1} : G
      ⇓ Theorem 8 (once per module)
    safe wrt. O:  [mod_0]_𝒥(CLsem),  [mod_1]_𝒥(CLsem),  ...,  [mod_{N−1}]_𝒥(CLsem)
      ⇓ Theorem 9
    safe wrt. (O − dom(plt)):  ℒ([mod_0]_𝒥(CLsem), [mod_1]_𝒥(CLsem), ..., [mod_{N−1}]_𝒥(CLsem))
      ⇓ Closure
    safe:  ℒ(𝒥(C), [mod_0]_𝒥(CLsem), [mod_1]_𝒥(CLsem), ..., [mod_{N−1}]_𝒥(CLsem))
      ⇓ Theorem 10
    safe:  ℒ(C, [mod_0]_CLsem, [mod_1]_CLsem, ..., [mod_{N−1}]_CLsem)
      ⇓ Corollary 9
    safe:  ℒ(C, [CC(mod_0)]_AsmSem, [CC(mod_1)]_AsmSem, ..., [CC(mod_{N−1})]_AsmSem)

Figure 7.2: Composing the proofs. CC abbreviates CompCert. O is
funspecs_of G. C is a program context compatible with (O − dom(plt)), as
described in the text.


7.3.1  Safely  Linking 

Proving  the  erasure  theorem  I  described  above  is  relatively  easy.  Proving 
safety  of  the  linked  semantics,  from  the  per-module  program-logic  proofs, 
requires  a  bit  more  ingenuity.  Here  is  the  statement  of  the  main  theorem: 


8 I have not yet implemented closure as a general theorem in Coq. The
simplest strategy, and the one supported by the current erasure proofs, is to
require that the specifications of those functions that "escaped" implementation by
$mod_1 \ldots mod_N$ be expressible purely on the dry part of the state, i.e. the CompCert
memory. These dry specifications can then be lifted to operate on juicy memories
as was done for juicy semantics. "Juicy" specifications can still be used in the
program logic to prove module specifications. In ongoing work on concurrency, I
have had initial successes with erasure of more complex specifications.

Theorem 9 (Linking Safety).

 1  ∀N plt (mod : I_N → Prog_CL) ge main id_main V G.
 2  let O = funspecs of G in
 3  let O' = O with {ext_spec_exit = ext_spec_post O main} in
 4  let sems = λidx : I_N. mkModsem J(CLsem) (globalenv mod_idx) in
 5  (* Postconditions in O imply return values are well-typed... *) ⟹
 6  (* The plt is well-formed wrt. global environments... *) ⟹
 7  (* Modules contain equal global symbol tables genv_symb *) ⟹
 8  (∀idx : I_N. V, G ⊢_func mod_idx : G) ⟹
 9  ∀x ω jm idx_main b_main v.
10    ext_spec_type O main = unit ⟹
11    fun_id main = id_main ⟹
12    plt id_main = Some idx_main ⟹
13    find_symbol (ge (sems idx_main)) id_main = Some b_main ⟹
14    ext_spec_pre O main x (genv_symb ge) (sig_args main) v ω jm
15    ⟹ ∃l : LinkedState N sems.
16      initial_core ge (Vptr b_main Int.zero) v = Some l
17      ∧ safeN (O' − dom(plt)) ge (level jm) ω l jm.


Prog_CL is the type of Clight programs (program Clight.fundef type in the
Coq code). Overall, the theorem states that if we have proved each module
correct in the logic (line 8), then for an entry point main, any juicy memory
state jm (and initial arguments v, external state ω, etc.), if jm satisfies
the precondition for main (line 14), then initializing the linked semantics
at main succeeds (line 16) and the initial state is safe for level-of-jm steps.
Why is safe-for-level-of-jm-steps sufficient, as opposed to safe-for-all-n? We
have proved9 that for any CompCert initial memory m, we can construct a
matching juicy memory jm such that m_dry jm = m, with arbitrary initial
level n. Because n is arbitrary, safety for level-of-jm steps gives safety for
any number of steps. Safety is with respect to specification O' − dom(plt), the
function specifications O' minus pre-/post-conditions for the implemented
functions (those in the domain of plt).

The  three  assumptions  I  have  elided  are: 


Postconditions imply well-typed return values. For each function f
specified by O, f's postcondition implies that the values f returns are
well-typed (with respect to f's signature). This property is required
for compiler correctness, e.g., in register allocation, to determine in
which class of registers to stick return values. It shows up here
because the property is built in, operationally, in linking semantics
$\mathcal{L}$.


9Definition initial_jm in file VST/veric/initial_world.v.

The plt is well-formed. Whenever plt f_id = Some idx (that is, the plt claims
that f_id is implemented by module idx), then module idx's global
environment contains a binding for f.

Modules contain equal global symbol tables genv_symb. Module global
environments contain equal symbol tables (genv_symb, mapping global
identifiers to the addresses at which they are allocated).

This assumption is not unrealistic. Assume we have two independent
proofs, in the Verifiable C logic, that modules $mod_1$ and $mod_2$ are safe
with respect to specification oracle O. If $mod_1$ and $mod_2$ declare
different (but consistent) sets of global identifiers, we can pre-process
$mod_1$ and $mod_2$ to include the exact same sets of global variable and
function declarations, in the same order, without invalidating the
associated Verifiable C proof scripts (these can be re-run unchanged in
the larger global context).
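
The pre-processing step described above can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: modules are represented as dictionaries from identifiers to declaration strings, and the names harmonize_globals, mod1, and mod2 are hypothetical, not part of the Coq development. The sketch merges consistent declarations and hands every module the same declarations in the same order.

```python
def harmonize_globals(modules):
    """Give every module the same ordered list of global declarations.

    `modules` maps a module name to a dict {identifier: declaration}.
    Raises ValueError when two modules declare an identifier differently,
    i.e. when the declaration sets are not consistent.
    """
    merged = {}
    for decls in modules.values():
        for ident, decl in decls.items():
            if ident in merged and merged[ident] != decl:
                raise ValueError(f"inconsistent declaration of {ident!r}")
            merged.setdefault(ident, decl)
    ordered = sorted(merged.items())  # one canonical order shared by all modules
    return {name: list(ordered) for name in modules}


# Hypothetical modules that declare different but consistent globals.
mod1 = {"f": "int f(int)", "x": "int x"}
mod2 = {"g": "void g(void)", "x": "int x"}
tables = harmonize_globals({"mod1": mod1, "mod2": mod2})
```

After the call, both modules carry identical declaration lists, which is the shape of input that the equal-symbol-table assumption asks for.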

Proof.  The  proof10  depends  on  Theorem  8.  The  main  difficulty,  as  in  the 
proofs  of  the  linking  theorems  of  Chapter  6,  is  in  devising  an  invariant  on 
the  stack-of-cores  runtime  states  of  linking  semantics  that  is  strong  enough 
to  prove  safety  of  the  overall  linked  program.  Unlike  in  Chapter  6,  which 
employed  binary  invariants  on  source-target  linked  states,  the  invariant 
here is unary (applies to a single program state). The invariant all_safe
(defined in VST/linking/safety.v) has shape:

all_safe : Ω → LinkedState N mod → juicy_mem → Prop
all_safe ω l jm

and is defined:

all_safe ω (l : LinkedState) jm =
  ∃fs : list (ident*typelist*type).
    last_frame_main fs ∧ stack_safe fs (stack l) ω jm

Existentially quantified is fs, the stack-trace of functions that have been
called up to this point. last_frame_main, the first conjunct of the invariant,
asserts that the bottom-most function in the stack is main.


10File VST/linking/safety.v states the safety invariant and contains the
majority of the proof; VST/linking/semax_linking.v applies the theorem to
the Verifiable C logic.

last_frame_main fs =
  case fs of
  | nil → True
  | f :: nil → f = main
  | _ :: fs' → last_frame_main fs'
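
As a quick illustration, the case analysis of last_frame_main can be mirrored in Python. This is a simplified sketch, not the Coq definition: stack-trace entries are plain strings rather than (ident*typelist*type) triples, and the head of the list is taken to be the top of the stack.

```python
def last_frame_main(fs, main="main"):
    """True iff the stack trace `fs` is empty or its bottom-most entry is `main`.

    `fs` lists function names from the top of the stack (index 0) downward.
    """
    if not fs:
        return True                 # nil case
    if len(fs) == 1:
        return fs[0] == main        # single remaining frame must be main
    return last_frame_main(fs[1:], main)  # skip the top frame and recurse


ok = last_frame_main(["g", "f", "main"])
bad = last_frame_main(["main", "f"])
```

Only the bottom-most entry matters: a trace that merely passes through main somewhere above the bottom frame does not satisfy the predicate.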


stack_safe asserts safety of the current stack of cores s by distinguishing the
head core c (on top of the stack) from the remaining cores s', all of which
are at_external:


stack_safe fs s ω jm : Prop =
  case fs, s of
  | nil, nil → True
  | f :: fs', c :: s' →
      ∃x : ext_spec_type O f.
        head_safe f x c ω jm ∧ tail_safe f x fs' s' jm
  | _, _ → False


Safety of the topmost core c (head_safe) is defined:


head_safe f (x : ext_spec_type O f) c ω jm =
  let idx = c.idx in
  let ge = ge sems_idx in
  let sem = sem sems_idx in
  let O' = O with {ext_spec_exit = ext_spec_post O f} in
  safeN sem O' ge (level jm) ω c jm


Recall that each module semantics sems_idx defines two projections, ge for the
global environment associated with module idx and sem for the interaction
semantics of idx. head_safe states that state c is safe, for level jm steps, with
respect to its associated semantics and global environment.

The predicate tail_safe is a bit more involved:

 1  tail_safe f x fs s jm = case fs, s of
 2  | nil, nil → True
 3  | ftop :: fs', c :: s' →
 4    let idx = c.idx in
 5    let ge = ge sems_idx in
 6    let sem = sem sems_idx in
 7    let O' = O with {ext_spec_exit = ext_spec_post O ftop} in
 8    tail_safe ftop xtop fs' s' jm ∧
 9    ∃ef v ω jm_0 (xtop : ext_spec_type O ftop).
10      R (level jm) jm_0 jm
11      ∧ at_external sem c = Some (ef, v)
12      ∧ ext_spec_pre O ef x (genv_symb ge) (sig_args ef) v ω jm_0
13      ∧ (∀vret jm' ω'.
14          R (level jm') jm_0 jm' ⟹
15          ext_spec_post O ef x (genv_symb ge) (sig_res ef) vret ω' jm' ⟹
16          ∃c'. after_external sem vret c = Some c'
17          ∧ safeN sem O' ge (level jm') ω' c' jm')
18  | _, _ → False


The invariant is defined recursively on fs and s. It states that, for each core
c in s (recall that the s here is the tail of the overall linked-program stack),
(i) c is at external (line 11), and (ii) c is safe for any return value vret and
states jm', ω' with which the environment may return (line 17), assuming
the return values/states are in relation R and satisfy the postcondition of
function ef (lines 14 and 15). The relation R is the same as that defined in
Section 7.1.2 (essentially, level jm < level jm_0).


Once the all_safe invariant has been defined, the brunt of the work of the
proof is to show the following progress/preservation property:

Lemma all_safe_invariant ω (l : LinkedState N mod) jm :
  (* if invariant holds initially, then *)
  all_safe ω l jm ⟹ level jm > 0 ⟹
  (* either (i), linked semantics takes a corestep and invariant is reestablished *)
  ∃l' jm'. ge ⊢ l, jm ⟼ l', jm' ∧ all_safe ω l' jm'
  (* or (ii), semantics is halted *)
  ∨ ∃vret. halted l = Some vret
  (* or (iii), can reestablish invariant over external calls *)
  ∨ ∃ef v. at_external l = Some (ef, v)
    ∧ ∃x : ext_spec_type O ef.
        ext_spec_pre O ef x (genv_symb ge) (sig_args ef) v ω jm
        ∧ ∀vret jm' ω' n''.
            n'' < level jm' ⟹
            R (level jm') jm jm' ⟹
            ext_spec_post O ef x (genv_symb ge) (sig_res ef) vret ω' jm' ⟹
            ∃l'. after_external vret l = Some l' ∧ all_safe ω' l' jm'

Assume all_safe ω l jm initially. Then either:

• the linked semantics takes a corestep, in which case we can reestablish
  the all_safe invariant,

• the linked semantics is (safely) halted, or

• the linked semantics makes a truly external call, i.e., to a function
  defined by none of the modules in the program. In this case, we must be
  able to reestablish the invariant for any return values and memories
  satisfying the function postcondition.

□ 
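
The shape of this argument (an invariant on the stack of cores that is re-checked after every step) can be seen in a toy Python model. The sketch below is an illustration under strong simplifying assumptions, not the Coq development: modules are dictionaries of tiny instruction lists, there is no memory or specification, and MODULES, PLT, instr, invariant, step, and run are all hypothetical names. The fuel parameter plays a role loosely analogous to level jm.

```python
# Hypothetical two-module program; instructions are ("call", name) or ("ret",).
MODULES = [
    {"main": [("call", "f"), ("ret",)]},
    {"f": [("call", "g"), ("ret",)], "g": [("ret",)]},
]
PLT = {"main": 0, "f": 1, "g": 1}  # function name -> implementing module


def instr(core):
    """Instruction at which a core (module index, function, pc) currently sits."""
    idx, fn, pc = core
    return MODULES[idx][fn][pc]


def invariant(stack):
    """Bottom frame is main and every non-top core is blocked at a call."""
    return stack[-1][1] == "main" and all(
        instr(core)[0] == "call" for core in stack[1:]
    )


def step(stack):
    """One step of the linked machine; the head of `stack` is the running core."""
    op = instr(stack[0])
    if op[0] == "call":
        callee = op[1]
        if callee not in PLT:
            return ("external", callee)           # truly external call
        return ("step", [(PLT[callee], callee, 0)] + stack)
    if len(stack) == 1:
        return ("halted", None)                   # main returned
    idx, fn, pc = stack[1]
    return ("step", [(idx, fn, pc + 1)] + stack[2:])  # resume the caller


def run(fuel=100):
    """Run from main, checking the invariant before every step."""
    stack, trace = [(PLT["main"], "main", 0)], []
    for _ in range(fuel):
        assert invariant(stack), "invariant broken"
        trace.append([fn for _, fn, _ in stack])
        tag, result = step(stack)
        if tag != "step":
            return tag, trace
        stack = result
    return "out of fuel", trace


outcome, trace = run()
```

The three tags returned by step correspond loosely to the three cases of the lemma: a core step that preserves the invariant, a halted state, and a call that no module implements.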


7.3.2  Squeezing  the  (Princeton)  Orange 

One can "squeeze the orange",11 in order to extract the juice from

$\mathcal{L}(J(C), [\![mod_0]\!]_{J(CLsem)}, [\![mod_1]\!]_{J(CLsem)}, \ldots, [\![mod_{N-1}]\!]_{J(CLsem)})$

by proving that the juicy linked semantics is simulated by its "dry" analog

$\mathcal{L}(C, [\![mod_0]\!]_{CLsem}, [\![mod_1]\!]_{CLsem}, \ldots, [\![mod_{N-1}]\!]_{CLsem})$.

11 David Walker refined this metaphor. Andrew Appel originally suggested the
name "juicy memories".

The  general  form  of  the  theorem  is: 

Theorem 10 (Linked Erasure). Let $sem_0, sem_1, \ldots, sem_{N-1}$ be interaction
semantics operating on CompCert memories, with $J(sem_i)$ the corresponding lifted
juicy semantics. Then there is a whole-program simulation

$\mathcal{L}(J(sem_0), J(sem_1), \ldots, J(sem_{N-1})) \le \mathcal{L}(sem_0, sem_1, \ldots, sem_{N-1})$


Proof.  In  Coq.12  □ 

As a corollary of Theorem 10 and Corollary 4, we get that erasure is
safety-preserving.
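
To make the idea of erasure concrete, here is a minimal Python sketch. It assumes a drastically simplified picture that is not the VST definition: a juicy memory is just a dry memory paired with a step level, the dry semantics is an arbitrary toy function, and the names JuicyMem, dry_step, juicy_step, erase, and simulated are hypothetical. The check expresses one square of a simulation diagram: erasing after a juicy step gives the same result as stepping the erased memory.

```python
from typing import NamedTuple, Optional, Tuple


class JuicyMem(NamedTuple):
    dry: Tuple[int, ...]  # stand-in for a CompCert memory
    level: int            # remaining step budget (the "juice" in this toy)


def dry_step(m: Tuple[int, ...]) -> Tuple[int, ...]:
    """Toy dry semantics: increment the first cell."""
    return (m[0] + 1,) + m[1:]


def juicy_step(jm: JuicyMem) -> Optional[JuicyMem]:
    """Toy juicy semantics: same transition, but consumes one level."""
    if jm.level == 0:
        return None  # no steps left at level 0
    return JuicyMem(dry_step(jm.dry), jm.level - 1)


def erase(jm: JuicyMem) -> Tuple[int, ...]:
    """Erasure forgets the juice and keeps the dry memory."""
    return jm.dry


def simulated(jm: JuicyMem) -> bool:
    """Every juicy step is matched by a dry step on the erased memory."""
    nxt = juicy_step(jm)
    return nxt is None or erase(nxt) == dry_step(erase(jm))


all_ok = all(simulated(JuicyMem((i, 7), n)) for i in range(5) for n in range(4))
```

In this toy the matching dry step always exists, so any safety property of the juicy run transfers to the dry run, which is the intuition behind erasure being safety-preserving.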


12File VST/linking/erase_juice


Application to CompCert


The  techniques  of  this  thesis  have  been  applied  to  the  CompCert  certified  C 
compiler  (version  2.1).  The  result  is  Compositional  CompCert,  the  codebase 
of  which  is  open  source  and  freely  available  on  GitHub.1 


8.1  Compositional  CompCert 

The proved-correct phases of the Compositional CompCert compiler are
shown in Figure 8.1, with optimization phases in gray. The main differences
with standard CompCert are: (1) We compile Clight to x86 assembly,
whereas standard CompCert compiles a slightly higher-level language
(CompCert C) to multiple assembly targets (x86, PowerPC, and ARM); and
(2) standard CompCert includes three additional RTL-level optimizations
(common subexpression elimination, constant propagation, and function
inlining); the adaptation of their proofs is ongoing work. The toplevel
theorems we prove are the following.


Theorem 11 (Compiler Correctness). Let CompCert denote the compilation
function that composes the phases in Figure 8.1. If CompCert(S) = Some T, for
Clight module S and x86 module T, then $[\![S]\!] \preceq [\![T]\!]$.


1https://github.com/PrincetonUniversity/compcomp

Figure 8.1: The phases of Compositional CompCert. Boxes in gray are
optimization passes. Outer boxes indicate source languages.


Proof.  By  transitive  composition  of  the  simulation  proofs  for  the  individual 
phases  in  Figure  8.1  using  Theorem  4.2  □ 
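
Theorem 11 is stated for a compilation function that may fail (CompCert(S) = Some T). As a rough illustration of how such a function composes partial phases, consider the following Python sketch. The phases lower and allocate are hypothetical toys, not phases of Figure 8.1, and None stands in for a failed compilation.

```python
def compose(phases):
    """Chain partial compiler phases; None models a failed compilation."""
    def compiler(module):
        result = module
        for phase in phases:
            result = phase(result)
            if result is None:
                return None  # a failing phase aborts the whole pipeline
        return result
    return compiler


def lower(module):
    """Hypothetical phase that always succeeds."""
    return module + ["lowered"]


def allocate(module):
    """Hypothetical phase that fails on an empty module."""
    return module + ["allocated"] if module else None


compile_ok = compose([lower, allocate])
compile_bad = compose([allocate, lower])
```

The correctness statement only constrains inputs on which every phase succeeds, which is why the theorem is conditional on the Some T result.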

Corollary 9 (Compositional Compiler Correctness). Let $P_S = S_0, S_1, \ldots,
S_{N-1}$ be a set of Clight modules with equal global domains such that
CompCert($S_i$) = Some $T_i$ for each $i$. Let $P_T$ abbreviate the target program
$T_0, T_1, \ldots, T_{N-1}$. Then $P_S \sim_{rc} P_T$.

Proof. Theorem 11 establishes the simulations $[\![S_i]\!] \preceq [\![T_i]\!]$. By Corollary 7,
we get that $P_S \sim_{rc} P_T$.3 The side conditions of Corollary 7 are:

• for all $i$, $[\![S_i]\!]$ is reach-closed (Theorem 1, Clight is reach-closed);

• for all $i$, $[\![T_i]\!]$ is valid (Theorem 2, x86 is valid); and

• for all deterministic contexts $C$, the linked semantics $\mathcal{L}(C, [\![P_T]\!])$ is
  also deterministic (follows by Theorem 3 and determinism of CompCert
  x86 assembly).

□


We  also  get  the  following  contextual  refinement. 


2The Coq proof is file compcomp/driver/CompositionalCompiler.v.
3The Coq proof is file compcomp/linking/CompositionalComplements.v.

Corollary 10 (Contextual Refinement for CompCert). Let $P_S = S_0, S_1, \ldots,
S_{N-1}$ be a set of reach-closed Clight modules with equal global domains such that
CompCert($S_i$) = Some $T_i$ for each $i$. Let $P_T$ abbreviate the target program
$T_0, T_1, \ldots, T_{N-1}$. Then $P_T \sqsubseteq_{rc} P_S$.

Proof. By Theorem 11 and Corollary 8.4 The side conditions are the same
as in Corollary 9. □


8.2  Anatomy  of  a  Phase 

Converting a CompCert phase to structured simulations typically
proceeded as follows: Refine CompCert's internal match-state relation $\sim$ (and
the auxiliary relations for activation records, frame stacks, etc.) to relations
$\sim_\mu$ indexed by structured injections. In particular, because external function
call interactions may introduce memory regions related by memory
injections in Compositional CompCert, the simulation relations of passes that
were previously proved as equality (or extension) phases had to be
reformulated as injections. Particular care was needed to assign correct ownership
and visibility information to compiler-introduced memory blocks.

In addition, add to each $\sim_\mu$ relation the clauses: $vis_\mu$ is closed under
reachability, and the relation is closed under restriction to the visible set
($\mu|_{vis_\mu}$). To ensure that global blocks were always mapped by each compiler
phase, we treated them as Frgn to all modules. While the addition of these
extra invariants proceeded in a mostly uniform manner across all phases,
the refinement of $\sim$ to $\sim_\mu$ was phase-by-phase, due to the considerable
internal differences between the various CompCert passes.

In total, porting the CompCert phases in Figure 8.1 to structured
simulations took approximately 10 person-months, though much of this time
was spent at the "boundaries" of the proof, updating the interfaces that
connected, in particular, our linking semantics and proofs to structured
simulations. In general, the porting time decreased as the project went
on. Adapting the first few phases of the compiler took a few weeks to a
month per phase, whereas the later phases went much more quickly (a day
or two per phase). This was due in part to greater familiarity with
CompCert, but also to the accumulation of a library of general-purpose lemmas
on structured injections and simulations, which will remain useful as we
continue to adapt the last few optimization passes.


4The Coq proof is file compcomp/linking/CompositionalComplements.v.

8.3 Anatomy of the Proof

As an (albeit imperfect) measure of the amount of effort involved in building
the mechanized development that accompanies this thesis, I report
lines-of-code for selected representative files in the development (Figure 8.2).

For Compositional CompCert, proofs of individual phases ("new") were
on the order of 5 klocs. By contrast, CompCert 2.1's ("old") proofs are about
2× smaller. The increase in proof lines is due mostly to the additional
invariants we prove. However, we have not yet applied much proof automation
at all, so we believe there is room for improvement.

The  increase  in  specification  size  is  due  to  the  use  of  duplicate  language 
definitions:  In  order  to  add  effects  to  the  CompCert  languages  we  duplicate 
the  step  relation  of  each  semantics  (once  with,  and  once  without,  effects), 
then  prove  that  the  two  semantics  coincide.  This  results  in  specification 
counts  that  are  larger  than  necessary. 
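
The duplication described here (one step relation with effects, one without, plus a proof that they coincide) can be pictured with a small Python sketch. It is a toy under stated assumptions: states are (pc, memory) pairs, the only effect tracked is the set of written addresses, and step, effstep, and coincide are hypothetical names rather than definitions from the development.

```python
def step(state):
    """Plain toy semantics: write pc * 2 at address pc, then advance pc."""
    pc, mem = state
    new_mem = dict(mem)
    new_mem[pc] = pc * 2
    return (pc + 1, new_mem)


def effstep(state):
    """Effect-annotated variant: same transition plus the written addresses."""
    pc, mem = state
    new_mem = dict(mem)
    new_mem[pc] = pc * 2
    return {pc}, (pc + 1, new_mem)


def coincide(state):
    """Both semantics agree, and addresses outside the effect set are unchanged."""
    effects, after = effstep(state)
    _, mem = state
    untouched = all(
        after[1].get(addr) == mem.get(addr)
        for addr in set(mem) | set(after[1])
        if addr not in effects
    )
    return after == step(state) and untouched


agree = coincide((0, {})) and coincide((3, {1: 5, 3: 7}))
```

Here the check is run on two sample states; in the mechanized development the corresponding statement is proved once for each language.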

The underlying theories are on the order of 4,000 lines of specifications
and approximately 32,000 lines of proofs, spread across more than 1,400
distinct lemmas. The code and proofs corresponding to Chapter 7 (not
shown in the table) total approximately 3,500 lines.

                       Specs.          Proofs          Lemmas
Compiler Phases:       old     new     old     new     old     new
SimplLocals            725     979     2168    4670    71      126
Csharpminorgen         1201    1634    1450    3207    65      96
Cminorgen              1619    1635    2796    5041    85      112
Selection              1663    1463    3239    5926    145     248
RTLgen                 961     1364    1475    4812    48      97
Tailcall               441     643     628     1713    19      32
Renumbering            441     643     267     1428    13      38
Allocation             765     1273    2197    4390    93      124
Tunneling              324     630     417     1941    14      51
Linearize              606     1359    750     2432    35      73
Cleanup Labels         282     729     372     2067    15      54
Stacking               712     1685    2906    6713    107     182
Asmgen                 1326    1970    2863    5370    105     125

Theories:                              Specs.  Proofs  Lemmas
Interaction Sems. (Chapter 3)          75      167     16
Trace Sems. (Chapter 3)                270     -       -
Gallina Sems. (Chapter 3)              58      -       -
Linking (Chapters 4 and 6)             2454    8469    481
Whole-Program Sims. (Chapter 5)        118     594     13
Structured Injs. (Chapter 5)           55      2099    182
Structured Sims. (Chapter 5)           349     8056    487
Transitivity (Chapter 6)               105     5285    49
Valid Semantics (Chapter 6)            234     -       -
Reach-Closed Semantics (Chapter 6)     270     -       -
Well-Defined Contexts (Chapter 6)      89      -       -
Clight Well-Defined (Chapter 6)        -       7035    181

Figure 8.2: Lines of code for selected parts of the development. "Lemmas"
is the number of theorems proved.
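
The aggregate figures quoted in the text can be recomputed from Figure 8.2. The short Python script below is only a convenience for checking the arithmetic; the dictionary and variable names are ad hoc, and the numbers are copied from the table.

```python
# (old, new) proof lines per compiler phase, copied from Figure 8.2.
PROOF_LINES = {
    "SimplLocals": (2168, 4670), "Csharpminorgen": (1450, 3207),
    "Cminorgen": (2796, 5041), "Selection": (3239, 5926),
    "RTLgen": (1475, 4812), "Tailcall": (628, 1713),
    "Renumbering": (267, 1428), "Allocation": (2197, 4390),
    "Tunneling": (417, 1941), "Linearize": (750, 2432),
    "Cleanup Labels": (372, 2067), "Stacking": (2906, 6713),
    "Asmgen": (2863, 5370),
}

# Theory rows from Figure 8.2; "-" entries are simply left out.
THEORY_SPECS = [75, 270, 58, 2454, 118, 55, 349, 105, 234, 270, 89]
THEORY_PROOFS = [167, 8469, 594, 2099, 8056, 5285, 7035]
THEORY_LEMMAS = [16, 481, 13, 182, 487, 49, 181]

old_total = sum(old for old, _ in PROOF_LINES.values())
new_total = sum(new for _, new in PROOF_LINES.values())
ratio = new_total / old_total          # growth factor of the phase proofs

spec_total = sum(THEORY_SPECS)
proof_total = sum(THEORY_PROOFS)
lemma_total = sum(THEORY_LEMMAS)
```

The ratio of new to old proof lines comes out a little above 2, and the theory totals match the roughly 4,000 specification lines, 32,000 proof lines, and 1,400 lemmas reported above.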


Chapter 


9  - 

Conclusion 


9.1  What  Has  Been  Achieved? 

This dissertation set out to give a semantic characterization of the program
contexts for which an optimizing C compiler is sound, answering the
question "For which program contexts is an optimizing C compiler correct?" In order to
do so, it developed

interaction and linking semantics (Chapters 3 and 4), which made it
possible to state compiler correctness as cross-language contextual
equivalence (Section 4.2, refined in Chapter 6), showing how to achieve
language-independence; and

structured simulations (Section 5.3.2), an extension of CompCert's forward
simulation proof method that composes both transitively, across
compiler phases (Chapter 6, Theorem 4), and horizontally, across
separately compiled modules (Chapter 6, Theorem 5), answering the
question "How to reason about equivalence of open modules?"

In answer to the question "Do the techniques scale to realistic languages like
C and to real systems like CompCert?" the above techniques were applied
to build Compositional CompCert (Chapter 8 and [SBCA14]), a verified
separate compiler for C. In addition, I showed (Chapter 7) how to connect
the Verifiable C program logic [ADH+14] to Compositional CompCert,
yielding a method for modular proving of compiled C/assembly programs.

9.2 Discussion

The techniques of this dissertation have applicability beyond just
C/assembly-language programs, separate compilation, or just CompCert.

Interaction  semantics  are  a  natural  tool  for  expressing  more  complex 
modes  of  interaction  than  the  linking  semantics  of  Chapter  4.  For  example, 
the  following  code 

Record ConcurrentState (threads : thread_idx → ThreadSem) =
  mkConcurrentState {
    schedule : ℕ → thread_idx;
    permissionOracle : ℕ → Set location;
    threadStates : ∀tid : thread_idx. option (CoreState (threads tid))
  }

sketches a possible adaptation of the LinkedState record of Chapter 4, which
modeled the state of a linked program, to a collection of concurrent threads.
Thread semantics are defined by a parameter threads that maps thread
indices thread_idx to records giving each thread's interaction semantics.
The components of ConcurrentState might include the schedule, a stream of
thread indices associated with each timestep of the concurrent execution,
a permission oracle (derived from, e.g., a program safety proof in
Concurrent Separation Logic) describing the ownership transfers that must
occur at each lock/unlock operation, as well as a map, threadStates, of the
core states associated with each active thread. It is not difficult to
imagine an adaptation of the interaction semantics $\mathcal{L}$ of program linking to
this (coarse-grained) concurrent setting. An important point is that even in
coarse-grained Pthreads-style concurrency, interactions among threads
occur only at external function call points (lock/unlock are themselves just
external functions). It is likely that most, if not all,
"external-function-call-like" protocols could be modeled in the interaction
semantics framework.
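
The observation that threads interact only at external call points can be illustrated with Python generators. The sketch below is a loose analogy under explicit assumptions, not a model of the ConcurrentState record: each thread is a generator that yields the name of the external function it is blocked at, a single global lock replaces the permission oracle, and worker and run are hypothetical names.

```python
def worker(name, log):
    """A toy thread: it yields control only at external calls (lock/unlock)."""
    log.append(f"{name}: compute")
    yield "lock"
    log.append(f"{name}: critical section")
    yield "unlock"
    log.append(f"{name}: done")


def run(threads, schedule):
    """Step threads according to `schedule`, a sequence of thread indices."""
    pending = {tid: None for tid in threads}  # external call each thread waits at
    owner = None                              # current holder of the single lock
    events = []
    for tid in schedule:
        if tid not in pending:
            continue                          # thread already finished
        call = pending[tid]
        if call == "lock":
            if owner is not None:
                continue                      # lock busy: thread stays blocked
            owner = tid
        elif call == "unlock":
            owner = None
        if call is not None:
            events.append((tid, call))
        try:
            pending[tid] = next(threads[tid])  # run until the next external call
        except StopIteration:
            del pending[tid]
    return events


log = []
threads = {0: worker("t0", log), 1: worker("t1", log)}
events = run(threads, [0, 1, 1, 0, 1, 0, 0])
```

Between two yields a thread runs without interruption, so the scheduler only ever observes lock and unlock events; this is the coarse-grained interleaving the paragraph describes.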

My work on separate compilation for C-like languages also exposed
some warts of C. For example, much of the complexity of structured
simulations (Chapter 5) was driven by C-language "features" such as addressed
stack-allocated local variables. When pointers to compiler-managed data
such as stack frames escape to external modules, one must keep careful
track, in compiler invariants, of those parts of memory that may be
updated by external functions. The invariants and proofs would have been
simpler in a more restrictive language setting, in which compiler-managed
data was instead guaranteed private.

From an engineering standpoint, the adaptation of CompCert 2.1 to
interaction semantics and structured simulations (Chapter 8) could have
been simplified considerably, in retrospect, if my colleagues and I had first
built  a  uniform  interface  to  CompCert's  compiler-phase  proofs.  It  is  not 
immediately  obvious  that  such  an  interface  is  possible.  However,  the  proofs 
do  share  many  features.  For  example,  CompCert's  simulation  invariants 
often  have  three  constructors:  One  for  the  normal  "running  state"  case,  one 
for  making  function  calls,  and  one  for  returning  from  calls.  Each  of  these 
cases  usually  decomposes  into  a  predictable  set  of  sub-invariants  on,  e.g., 
the  stack  frame  and  temporaries  environment.  Spending  some  engineering 
effort  up  front  to  expose  this  structure  might  have  made  it  possible  to  adapt 
the  compiler  proofs  to  the  Compositional  CompCert  framework  in  a  more 
uniform  fashion. 


9.3  Future  Directions 

Heterogeneous Verified Systems. The verification techniques I described
in Chapter 7 (with which modules are proved independently in the
Verifiable C logic, against a common specification O, and then proved sound
with respect to the $\mathcal{L}$ semantics of Chapter 4) have so far targeted mostly-C
programs. The approach supports more heterogeneous systems, in which
some modules are in C, some are in assembly, and others are in a third
language such as Coq's Gallina. However, the verification techniques are
not optimized for such highly heterogeneous programs: modules in
languages other than C (e.g., those in CompCert x86 assembly) must be proved
directly from their operational semantics.

It would be convenient to provide support for other program logics,
besides just Verifiable C. For example, one might prove assembly modules
correct in a variant of XCAP [NS06] or in a modified Bedrock [Chl11],
adapted to CompCert x86 assembly. There are no major technical limitations
here. Chapter 7's semantics preservation proofs were designed to be mostly
independent of the particular program logic used to establish per-module
safety. The major interdependency is the shared specification language O.

More radical is to provably compile programs that include modules
written in a high-level language like Gallina, in addition to C and assembly.
One could use Coq's current code extraction to compile the Gallina
modules to OCaml, and then further compile with ocamlopt. However, this
process yields an unverified toolchain (Coq extraction, ocamlopt, and the
OCaml runtime must all be trusted).1 Another (better) solution is to build
a verified compiler for Coq itself, a project currently underway at
Princeton. This "CertiCoq" could then be fruitfully combined with Compositional
CompCert to yield a certified compiler for Gallina/C/assembly programs.


1 One could argue that, as Coq users, we must trust the latter two already.

The  motivation  here  is  efficiency  of  the  program  verifier,  by  which  I 
mean  the  human  actually  guiding  the  proof  assistant.  Certain  low-level 
software  components,  such  as  garbage  collectors  and  OS  kernels,  are  more 
suited  to  implementation  at  the  C  level  of  abstraction.  However,  the  cost  of 
verification  is  proportionately  higher  at  this  level.  Other  components — such 
as  the  "glue"  code  that  composes  large  sections  of  many  software  systems — 
are  more  conveniently  implemented  and  proved  in  a  purely  functional 
language  like  Gallina  with  a  clean  proof  theory. 

Syntactic Linking. The linking operator $\mathcal{L}$ first introduced in Chapter 4 is
semantic,  in  the  sense  that  it  takes  as  arguments  not  the  syntax  of  its  input 
modules  but  instead  their  associated  interaction  semantics,  and  produces  a 
new  interaction  semantics  as  result.  There  is  an  alternative  kind  of  syntactic 
linking  that  operates  directly  on  the  syntax  of  modules,  e.g.,  by  syntactically 
concatenating  a  number  of  program  fragments,  all  of  which  must  be  in  the 
same  language.  In  general,  semantic  linking  provides  more  flexibility;  for 
instance,  it  enables  linking  of  modules  in  a  variety  of  languages,  as  long  as 
the  domain  of  interpretation  is  shared  across  modules.  On  the  other  hand, 
cross-language  syntactic  linking  does  not  make  sense  (a  C  module  cannot 
be  concatenated  to  an  assembly  module;  the  types  do  not  match). 

Syntactic linking is nevertheless useful. It would support certain
cross-module optimizations such as external function inlining (compile two
modules to a common intermediate language; link syntactically; then do
standard intramodule inlining). A syntactic linking proof at the x86 assembly
level would also provide additional justification, beyond the arguments
already presented in Chapter 3, of Compositional CompCert's "copying"
x86 interaction semantics. This assembly-level syntactic linking proof might,
in addition, provide a means of integration into projects such as Shao et al.'s
CertiKOS certified microkernel [GVF+11, GKR+15], which currently uses
a modified CompCert compiler to translate C functions to assembly code,
in the context of assembly-level callers. This modified CompCert compiler
does not yet support address-taken local variables, one of the aspects of
C that so complicated the Compositional CompCert proofs (cf. Chapters 1
and 5).

I have done initial experiments with syntactic linking.2 The idea, rather
than reproving syntactic linking for each language, is to define a general
proof infrastructure that depends on (i) a monoid $c_1 \circ c_2$ on corestates of the
language, modeling abstract stack-frame composition in linking semantics
$\mathcal{L}$;3 and (ii) proofs that the corestep relation, at_external, after_external, etc.
are compatible with the monoid. The simulation that relates the abstract $\mathcal{L}$
activation-record stack s to the actual stack s' (of the syntactically linked
whole program) states that s' is the fold of the monoid operator $\circ$ over s
(lifted to option, with unit None).


2File compcomp/linking/stacking.v.
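
The fold of the monoid operator over the abstract stack can be written down directly. The following Python sketch assumes the simplest instance, list append on activation-record stacks, with Python's None standing in for the option unit; compose and flatten are hypothetical names, and the frame labels are made up for the example.

```python
from functools import reduce


def compose(c1, c2):
    """Monoid operator lifted to option: None is the unit, lists are appended."""
    if c1 is None:
        return c2
    if c2 is None:
        return c1
    return c1 + c2


def flatten(abstract_stack):
    """Fold `compose` over the per-module corestates of the abstract stack."""
    return reduce(compose, abstract_stack, None)


# Each corestate is itself a small stack of activation records (top first).
abstract = [["g#0"], ["f#0", "f#1"], ["main#0"]]
actual = flatten(abstract)
```

The folded result is the single flat stack that the syntactically linked whole program would carry, which is the relationship the simulation is meant to maintain.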

9.4  Conclusions 

Tony Hoare, in 2005, called the verifying compiler one of the "grand
challenges" [HM05] of computer science, on par with Fermat's last theorem
(in math) and P vs. NP. Such comparisons are perhaps a bit overblown.
Regardless, Leroy's CompCert, the first verified optimizing compiler for a
realistic language, was a definitive milestone. This thesis, while definitively
not the last word on compiler verification, has advanced the art by a few
(small) steps.


3In a language like C, the monoid is defined, for example, as list append on
activation-record stacks.